The thing that set our crowd apart in the working world is now vanishing, and that’s sad.
I wonder how future movies will depict programmers: depressed faces getting angrier and angrier chatting with a CLI coding agent! This will not inspire future generations!
Your daughter is so lucky! I mean, the physical UX is very reminiscent of Teenage Engineering, it looks great!
The further I scrolled down the article, the more I hoped for an « order » button :)
Yes, using Python UDFs within Spark pipelines is a hog! That’s because the UDF, along with the Python context it references, is serialized with cloudpickle and sent over the wire to the executor nodes! (This can amount to a few GB of serialized data, depending on the UDF and the driver process's Python context.)
Laketower: https://github.com/datalpia/laketower
A lightweight data lakehouse exploration and management app (web + CLI), using DuckDB as the default query engine. It can run locally or self-hosted, and for now it is statically configured only. I hope to integrate Iceberg and DuckLake support by the end of the year.
Modelship: https://github.com/datalpia/modelship
An ML model to app generator. For now, only ONNX models are supported as input, and only a static website as the target (ONNX Runtime Web with WASM/WebGPU). I intend to work more on it in the coming weeks/months, especially to support more model I/O types and add more targets (REST API, CLI, etc.).
These 2 projects were born from professional needs, but they are a nice playground to learn and try new things.
- Users are confused by autogenerated docs and don’t even want to try using a project because of it
- Real curated project documentation is no longer corrected by user feedback (because users never reach it)
- LLMs are trained on wrong autogenerated documentation: a downward spiral for hallucinations! (Maybe this one could then force users to go look for the official docs? But not sure at this point…)
> LLMs are trained on wrong autogenerated documentation: a downward spiral for hallucinations! (Maybe this one could then force users to go look for the official docs? But not sure at this point…)
I wonder what incentives might exist for adhering to this meta tag. For example, imagine I send you my digital resume and it has an AI-generated footer tag on display? Maybe that's a bad example. I like the idea in general, but my mind wanders to the fact that large entities completely ignored the wishes of robots.txt when collecting the internet's text for their training corpora.
Large entities aside, I would use this to mark my own generated content. It would be even more helpful if you could get the LLM to recognise it, which would allow you to prevent ouroboros situations.
Also, no one is reading your resume anymore and big corps cannot be trusted with any rule as half of them think the next-word-machine is going to create God.
> This separation also avoids the threading and memory-safety limitations that would arise from embedding DuckDB directly inside the Postgres process, which is designed around process isolation rather than multi-threaded execution. Moreover, it lets us interact with the query engine directly by connecting to it using standard Postgres clients.
Curious to know more about the commercial licensing scheme for Yaak: if I’ve read correctly, purchasing a Pro license is based on « good faith », as the features are exactly the same as in the MIT-licensed Hobby version?
Sincere question: I've been studying lots of OSS commercial licensing and always wonder what works in which context.
Yes, it's a good-faith license. The license doesn't even apply to the OSS version (only prebuilt binaries).
The bet is that super fans will pay for it in the early days and, as it gets adopted by larger companies, they will pay in order to comply with the legalities of commercial use. So far, it's working! The largest company so far is 34 seats, with a couple more in the pipe!
You can be an Oracle and audit your customers and develop that adversarial relationship. The idea is that that sort of thing makes you rot in the long run.
Pretty poorly actually, people avoid Oracle products like the plague. Nobody is buying a JVM from Oracle or buying their DB - they're using open source solutions that are both free and provide more features.
They have a lot of inertia, but that's it. If you're in greenfield development, there is a close to 0% chance you will choose Oracle as your RDBMS.
Hey, personally I agree. Why would I ever go with Oracle.
But that's A) me personally and B) me in Cloud/Startup type companies, so of course we don't go with Oracle.
But like you mentioned, inertia. My previous gigs at large multinationals were of course all Oracle. They were all huge and had zero reason not to just pay the Oracle tax. Which is why Oracle is going strong.
Despite all the rage, Oracle can still survive quite some time on running boring things like, I don't know, many large banks and other boring old businesses. Which of those is really gonna go "AWS Aurora MySQL" when they have had an in-house "Oracle Exadata" running their entire business operation "just fine" for longer than those Cloud providers have even been around?
I am sure everyone making shareware in the early 1990s would have loved to spy on people to know how many used their software for free (and to have a way to spam those users to try to sell more licenses), but they couldn't and just did without that.
Thank you for your honest and detailed answer! Great to see it’s working so far and that it allows you to build a true OSS product in the meantime. I really appreciate that (I think this is the biggest benefit of your licensing scheme).
Under pricing, you could list the Hobby tier as "free or pay what you want". $50/yr isn't crazy, but you might get a few smaller donations if that were an option.
If I asked my security team whether I could use Yaak, they would (probably) say yes, and legal would say that under no circumstances am I to use a personal license; they will pay for a commercial license. Large companies are incredibly risk averse when it comes to stuff like this.
When you get to think about branchless programming, especially for SIMD optimizations in the real world, you always learn a lot, and it’s as if you gain a +1 level on your algorithmic skills. The hardest part then is making sure the tricks are clearly laid out so that someone else can take it from there next time.
It also triggered some memories for me too. A college professor wanted to teach all the bit manipulating stuff and gave an assignment where students had to transform branchy code into branchless code using shifts and bit operators. Had a lot of fun doing that.
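For readers who have not seen this kind of exercise, here is a small sketch of the branchy-to-branchless transformation with shifts and bit operators. It is written in Python purely for readability (Python gains no speed from it), and it assumes inputs that fit in a signed 32-bit integer and a difference `x - y` that does not overflow that range; the same expressions are the ones usually written in C or mapped onto SIMD lanes.

```python
def branchy_abs(x: int) -> int:
    if x < 0:
        return -x
    return x


def branchless_abs(x: int) -> int:
    # Arithmetic shift: mask is -1 (all ones) when x is negative, else 0.
    mask = x >> 31
    # x ^ -1 flips every bit (~x), and subtracting -1 adds 1: two's-complement negation.
    # With mask == 0 both operations are no-ops.
    return (x ^ mask) - mask


def branchy_min(x: int, y: int) -> int:
    if x < y:
        return x
    return y


def branchless_min(x: int, y: int) -> int:
    diff = x - y
    # mask is -1 when x < y, else 0 (assumes diff fits in 32 bits).
    mask = diff >> 31
    # When x < y: y + diff == x. Otherwise: y + 0 == y.
    return y + (diff & mask)


for value in (-5, 0, 7):
    print(value, branchy_abs(value), branchless_abs(value))
```

The comments are the part that matters most in practice: without the note about what `mask` holds, the next reader has to rederive the trick.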
Yeah, this is the kind of thinking that opens up new heights once you get it!
These types of exercises should be mandatory for all compute-intensive jobs, especially for data science people (I know some of them know this stuff, but that does not seem to be the majority).
> Today, AI types most of our code. We think, they type. Rather than diminishing our value, this shift amplifies it as thinkers — especially for those who love architecture
This way you get to experience the job of a senior principal solution architect: thinking about big ideas, then letting the engineering workforce build them while trying to fit a square peg into a round hole…
Irony aside, I've been using Claude Code on and off for 3 months. The tech is crazy already but… I'm pretty sure there is no real acceleration (time spent dreaming and prompting counts, don’t get fooled), and the feeling of accomplishment from implementing a feature is gone for me. So maybe I’d rather enjoy doing the tech myself and only use it as a very powerful Stack Overflow-like Q&A.