> I’ll tell the LLM my main goal (which will be a very specific feature or bugfix e.g. “I want to add retries with exponential backoff to Stavrobot so that it can retry if the LLM provider is down”), and talk to it until I’m sure it understands what I want. This step takes the most time, sometimes even up to half an hour of back-and-forth until we finalize all the goals, limitations, and tradeoffs of the approach, and agree on what the end architecture should look like.
This sounds sensible, but also makes me wonder how much time is actually being saved if implementing a "very specific feature or bugfix" still takes an hour of back and forth with an LLM.
Can't help but think that this is still just an awkward intermediate phase of development with adolescent LLMs where we need to think about implementation choices at all.
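For scale, the feature in the quote is genuinely small. A minimal retry-with-exponential-backoff wrapper (a sketch with hypothetical names, not Stavrobot's actual code) is about a dozen lines of Python:

```python
import random
import time

def call_with_backoff(fn, max_retries=5, base_delay=1.0, max_delay=60.0):
    """Retry fn() with exponential backoff and jitter when it raises."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries; let the caller see the failure
            # Double the wait each attempt, cap it, and add jitter so
            # many clients don't all hammer the provider in lockstep.
            delay = min(base_delay * 2 ** attempt, max_delay)
            time.sleep(delay + random.uniform(0, delay / 2))
```

Which is rather the point: for a change this size, the half hour of negotiation is the cost, not the typing.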
Can you be more specific about which math results you are talking about? Looks like a significant improvement on FrontierMath, especially for the Pro model (the most inference-time compute).
Are you maybe comparing the Pro model to the non-Pro model with thinking? Granted, it's a bit confusing, but the Pro model is 10 times more expensive and probably much larger as well.
Given how good Apple Silicon is these days, why not just buy a spec'd out Mac Studio (or a few) for $15k (512 GB RAM, 8 TB NVMe), maybe paying only for S3 to sync data across machines. No talent required to manage the gear. AWS EC2 costs for similar hardware would net out to the purchase price in something ridiculous like 4 months.
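The break-even arithmetic, using only the numbers implied by the comment (the EC2 monthly figure is the assumption baked into "like 4 months", not a quoted price):

```python
mac_studio = 15_000   # spec'd out Mac Studio, per the comment above
ec2_monthly = 3_750   # hypothetical monthly EC2 cost for comparable capacity
print(mac_studio / ec2_monthly)  # 4.0 months to break even
```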
I recently got assigned to enhance some code I've never seen before. The code was so bad that I'd have to fully understand it and change multiple places to make my enhancement. I decided that if I was going to be doing that anyway, I might as well refactor it into a better state first. It feels so good to make things better instead of just making them do an extra thing.
More often than not I've seen this be the case. Refactoring as "rewrite it in my idiomatic style so that I can understand it", which doesn't scale across a team, so the next engineer does the same thing.
Most of my colleagues are content to spend 50 hours chopping up the tree with a pipe. We don't have time to spend making things work properly! This tree has to be finished by tomorrow! Maybe after we've cut up this forest, then we'll have a bit of spare time to sharpen things.
As Charlie Munger used to say “show me the incentives and I’ll show you the outcome”.
What are the incentives for these developers? Most businesses want trees on trucks. That’s the only box they care to check. There is no box for doing it with a sharp axe. You might care, and take the time to sharpen all the axes. Everyone will love it, you might get a pat on the back and a round of applause, but you didn’t check any boxes for the business. Everyone will proceed to go through all the axes until they are dull, and keep chopping anyway.
I see 2 year old projects that are considered legacy systems. They have an insurmountable amount of technical debt. No one can touch anything without breaking half a dozen other things. Everyone who worked on it gets reasonably rewarded for shipping a product, and they just move on. The business got its initial boxes checked and everyone who was looking for a promotion got it. What other incentives are there?
It's not about incentives; it's just bad management. As you said, the business just wants trees on trucks, so good management would realise that you need to spend some time sharpening axes to get trees on trucks quickly. It just seems to be something that a lot of software managers don't get.
I don't think every company is like this though. E.g. Google and Amazon obviously have spent a mountain of time sharpening their own axes. Amazon even made an axe so sharp they could sell it to half the world.
Early on in Amazon’s history (long before same day shipping), they added a feature that would tell you, on a product page, whether you had recently bought that same product. The metrics spoke loud and clear: it caused purchase count to go down. Human common sense about the customer’s experience overruled the data and they have some variation of that feature to this day. That’s the “customer obsession,” but unfortunately most businesses only copy the “data driven”.
There is some optimal amount of time to spend on sharpening: if you spend either more or less time sharpening, the net amount of trees on trucks goes down. Smart businesses look for that amount. Really smart businesses know what the amount is, and make sure that they spend very close to that amount of time sharpening.
Indeed. My point is that the right amount is waaaaay more than most people think it is. At least in my experience.
I think part of the problem is people get... I guess "speed blindness". When stuff is taking ages they just think that's how long it takes. They don't realise that they could be twice as fast if they spent some of their time fixing & improving their tooling.
This approach is also what I'm still missing in agentic coding.
It's even worse there because the AI can churn out code and never think "I've typed the same thing 5x now. This can't be right."
So they never make the change easy because every change is easy to them... until the lack of structure and re-use makes any further changes almost impossible.
This is a great observation. I've noticed the same pattern with AI-generated code and deployment configs. Ask it to set up a Node.js service and it will happily write a new PM2 ecosystem file every time rather than noticing you already have one with shared configuration.
The "make the change easy first" mindset requires understanding what already exists, which is fundamentally a compression/abstraction task. Current models are biased toward generation over refactoring because generating new code has a clearer reward signal than improving existing structure. Until that changes, the human still needs to be the one saying "stop, let's restructure this first."
I have mixed feelings here because on one hand I prefer the “axe” when programming (vim with only the right extensions and options). But for trees… chainsaws are quite a bit easier. Once it is chopped down, maybe rent a wood splitter.
Sometimes sharpening the axe means breaking it completely for people still trying to cut down trees on WinXP, but you don't know that because you can't run those tests yourself, and grovelling through old logs shows nobody else has either since 2017 so it's probably no big deal.
Sometimes it's not clear which part is actually the cutting blade, and you spend a long time sharpening something else. (If you're really unlucky: the handle.)
> Of course, AI will continue to improve. But it is also likely to get more expensive… Microsoft’s share price took a beating last week as investors winced at its enormous spending on the data centres underpinning the technology. Eventually these companies will need to demonstrate a return on all that investment, which is bound to mean higher prices.
That’s not how it works, and I’m surprised this fallacy made it into the Economist.
Commodity producers don’t get to choose the price they charge by wishful thinking and aspirational margins on their sunk costs. Variable cost determines price. If all the cloud companies spend trillions on GPUs, GPU rental price (and model inference cost) will continue going down.
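A toy illustration of why sunk capex drops out of the pricing decision (all numbers invented):

```python
# Once the GPU is bought, the capex is spent whether it runs or idles,
# so the rent-or-idle decision only compares price against variable cost.
capex = 30_000        # sunk: identical under every pricing choice (made up)
variable_cost = 0.50  # power + maintenance per GPU-hour (made up)

def marginal_profit(price_per_hour):
    # capex deliberately absent: no pricing choice can change it
    return price_per_hour - variable_cost

print(marginal_profit(0.90))  # 0.40/hr: renting out below "capex-recovering"
                              # prices still beats idling at 0.00
```

So once everyone has already bought the GPUs, competition drives the rental price toward that variable cost, regardless of what anyone hoped to earn back.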
Indeed, the trend of cheaper and cheaper AI at the same level of performance has been even more consistent empirically than the improvement in frontier model performance.
> I’m surprised this fallacy made it into the Economist.
Fallacies in the Economist, you say without irony?
From TFA:
> Yet investors risk misdiagnosing the industry’s troubles.
Oh, I think investors are asking, "where is the actual gain in capability and/or productivity?" Because they don't see it. And huge lay-offs don't prove it.
> Of course, AI will continue to improve.
This is begging the question. However, for the sake of argument, let's assume that AI continues to improve.
The thing we call AI is a long way from improving robotics. Its most prevalent practical value right now is improved search and code/text information assistance. But these improvements themselves have proven to be far from perfect.
The only monetisable path that we have seen in search and information assistance is advertising. The AI boom walks and quacks like what it is: a bubble, and the dawn of a new generation of enshittification.
Good instinct. A lot of the day to day is debugging nitty things, reconciling small differences in results, trying not to make dumb mistakes. Almost all attempts to do very smart theoretical novel work fail, often because of extremely mundane engineering and data issues.
There’s a great quote from Nick Patterson of RenTech who says that the most sophisticated technique they generally used was linear regression, and the main thing was avoiding stupid mistakes:
“I joined a hedge fund, Renaissance Technologies, I'll make a comment about that. It's funny that I think the most important thing to do on data analysis is to do the simple things right. So, here's a kind of non-secret about what we did at Renaissance: in my opinion, our most important statistical tool was simple regression with one target and one independent variable. It's the simplest statistical model you can imagine. Any reasonably smart high school student could do it. Now we have some of the smartest people around, working in our hedge fund, we have string theorists we recruited from Harvard, and they're doing simple regression. Is this stupid and pointless? Should we be hiring stupider people and paying them less? And the answer is no. And the reason is nobody tells you what the variables you should be regressing [are]. What's the target. Should you do a nonlinear transform before you regress? What's the source? Should you clean your data? Do you notice when your results are obviously rubbish? And so on. And the smarter you are the less likely you are to make a stupid mistake. And that's why I think you often need smart people who appear to be doing something technically very easy, but actually usually not so easy.”
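For anyone who hasn't touched it since high school, the model Patterson is describing really is this small (a generic OLS sketch, nothing RenTech-specific):

```python
def simple_regression(xs, ys):
    """Ordinary least squares: one target, one independent variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

Everything hard in his list (which variables, what transform, cleaning the data, noticing rubbish) happens before and after this function is called, which is exactly his point.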
Moved to NYC. Have a good team at my new job. Satisfied with my income. Have enough free time. Made a lot of good friends really fast, and now I see a rotating cast of them 3-7 days a week. Happy with my apartment. Have an east facing window so I don't have to set an alarm to wake up in the morning; I wake with the sunrise. Getting plenty of exercise and walking 8-12k steps a day.
Overcomplicated take. Burnout comes from lacking a feeling of forward progress and tractability in your problems, regardless of current objective state.
That is part of it, but there is also something to be said about what is going on biochemically, IMO. Even if you are feeling forward progress and comfortable with the scope of your problems, giving yourself no time to rest and get out of a subconsciously anxious state isn't very good.
Anxiety is meant to have your senses heightened to perhaps hear the tiger stalking you and encourage you to seek out a safer environment where you can comfortably rest. You aren't built to be in an anxious state for such extended periods of time. The tiger would have gotten you by then, with the way this system was designed. You aren't built to constantly run from the tiger.