Let's be real. The sky is blue because God thought it was a pretty color, simple as. All this stuff about wavelengths and resonant frequencies and human color perception got retconned into the physics engine at some point in the past millennium; that's why all these epicycles are needed.
Centaurs are a transient phenomenon. In chess, the era of centaur supremacy lasted only about a decade before computers alone eclipsed human+computer. The same will be true in every other discipline.
You can surf the wave, but sooner or later, the wave will come crashing down.
They are transient only in those rare domains that can be fully formalized/specified. Like chess. Anything that depends on the messy world of human-world interactions will require humans in the loop for translation and verification purposes.
>Anything that depends on the messy world of human-world interactions will require humans in the loop for translation and verification purposes.
I really don't see why that would necessarily be true. Any task that can be done by a human with a keyboard and a telephone is at risk of being done by an AI - and that includes the task of "translation and verification".
Sure, but at the risk of running into completely unforeseen and potentially catastrophic misunderstandings. We humans are wired to use human language to interact with other humans, who share our human experience, which AIs can only imperfectly model.
I have to say I don't feel this huge shared experience with many service industry workers. Especially over the phone. We barely speak the same language!
Mathematics is indeed one of those rare fields where intimate knowledge of human nature is not paramount. But even there, I don't expect LLMs to replace top-level researchers. The same evolutionary "baggage" that makes it impossible to simulate humans and automate them away is also what enables (some of) us to have deep insight into the most abstract regions of maths. In the end it all relies on the same skills, developed through millions of years of tuning into the subtleties of 3D geometry, physics, psychology and so on.
I'm guessing that they were referring to the depth of the decision tree that can be computed in a given amount of time?
In essence, it used to be (I have not stayed current) that the "AI" was limited in how many moves into the future it could search to determine which move was optimal.
That limit means it is impossible to enumerate all possible moves and determine which one is guaranteed to lead to a win. (The best that can be done is to have a machine-learning algorithm choose the most likely set of moves a human would make from the current state, and then pick which of that set would most likely lead to a win.)
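For illustration, here is a minimal depth-limited minimax sketch in Python; the moves/apply/evaluate game interface is a hypothetical stand-in for the example, not any particular engine's API:

    # Depth-limited minimax: past `depth` plies the search stops and falls
    # back to a heuristic evaluation, so the result is a guess, not a
    # guaranteed outcome. The `game` interface (moves/apply/evaluate) is
    # assumed for illustration.
    def minimax(game, depth: int, maximizing: bool) -> float:
        moves = game.moves()
        if depth == 0 or not moves:
            return game.evaluate()  # heuristic score of this position
        scores = (minimax(game.apply(m), depth - 1, not maximizing) for m in moves)
        return max(scores) if maximizing else min(scores)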
On the other hand, chess is not very financially rewarding. IBM put some money into it for marketing briefly, but that’s probably equal to about five minutes of spend from the current crop of LLM companies.
As far as I can tell from scanning forums, to the extent humans contribute anything to the centaur setup, it is entirely in hardware provisioning and in allocating enough server time before matches for the chess engines to do precomputation, rather than anything actually chess-related. But I am unsure on this point.
I have heard anecdotally from non-serious players (and therefore I cannot be certain this reflects sentiment at the highest levels, although the ICCF results seem to back it up) that the only ways to lose in centaur chess at this point are to deviate from what the computer tells you to do, either intentionally or unintentionally by submitting the wrong move, or simply to be at a compute disadvantage.
I've got several previous comments on this because this is a topic that interests me a lot, but the two most topical here are the previous one and https://news.ycombinator.com/item?id=33022581.
The last public ranking of chess centaurs was in 2014, after which it is generally held to be meaningless, as the rating of a centaur is just the same as the rating of the engine. Magnus Carlsen’s peak Elo of 2882 is by far the highest any human has ever achieved. Stockfish 18 is estimated to be in excess of 4000 Elo, which is to say the difference between it and the strongest human player ever is about the same as the difference between a strong club player and a grandmaster. It’s not going to benefit meaningfully from anything a human player might bring to the partnership.
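For scale: under the Elo model, the expected score of the lower-rated player is 1 / (1 + 10^((R_high - R_low) / 400)), so a gap of ~1100 points is crushing. A quick back-of-the-envelope in Python:

    # Expected score of the lower-rated player under the Elo model:
    #   E = 1 / (1 + 10 ** ((r_high - r_low) / 400))
    def expected_score(r_low: float, r_high: float) -> float:
        return 1.0 / (1.0 + 10.0 ** ((r_high - r_low) / 400.0))

    # Carlsen's peak (~2882) vs. an engine estimated at ~4000:
    print(expected_score(2882, 4000))  # ~0.0016: under 2 points per 1000 games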
Magnus himself said in 2015 that we’ve known for a long time that engines are much stronger than humans, so the engine is not an opponent.
I'm highly worried that you are right. But what gives me hope is that people still play chess; I'd argue even more than ever. People still buy paper books and vinyl records. People still appreciate handwritten greeting cards over printed ones, and pay extra to hear live music when the recording is free and will likely sound much better. People are willing to pay an order of magnitude more for a seat in a theater at a live play, or pay a premium for handmade products over almost-indistinguishable knockoffs.
Just wait. In a few years we'll have computer-use agents that are good enough that people will stop making APIs. Why bother duplicating that effort, when people can just direct their agent to click around inside the app? Trillions of matmuls to accomplish the same result as one HTTP request.
This strikes me as a very agent-friendly problem. Given a harness that enforces sufficiently rigorous tests, I'm sure you could spin up an agent loop that methodically churns through these functions one by one, finishing in a few days.
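Something like this, as a rough sketch; ask_agent, apply_patch, revert_patch, and run_tests are hypothetical stand-ins for whatever agent API and test harness you actually have:

    # Sketch of a test-gated agent loop. All four helpers are hypothetical
    # stand-ins, not a real agent API.
    def churn(functions: list[str], max_attempts: int = 5) -> list[str]:
        failed = []
        for fn in functions:
            for _ in range(max_attempts):
                patch = ask_agent(f"Implement {fn}; all existing tests must pass.")
                apply_patch(patch)
                if run_tests(fn):        # the rigorous harness is the real gate
                    break
                revert_patch(patch)      # throw away the failed attempt and retry
            else:
                failed.append(fn)        # exhausted attempts; flag for a human
        return failed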
Have you ever used an LLM with Zig? It will generate syntactically invalid code. Zig breaks so often, and LLMs have such an eternally old knowledge cutoff, that they only know old-ass, broken versions.
The same goes for TLA+ and all the other obscure things people think would be great to use with LLMs. And they would be, if there were as much training data as there is for JavaScript and Python.
I find Claude does quite well with Zig. This project is like >95% Claude, and it's an incredibly complicated codebase [0] (which is why I am not doing it by hand):
[0] Generates a dynamically loaded library which does sketchy shit to access the binary representation of data structures in the Zig compiler, and then transpiles the IR to Zig code, which has to be rerun to do the analysis.
To be fair, this was true of early public LLMs with rust code too. As more public zig repositories (and blogs / docs / videos) come online, they will improve. I agree it's a mess currently.
A little bit! I wrote a long blog post about how I made it. I think the strategy of having an LLM look at individual std modules one by one made it actually pretty accurate. Not perfect, but better than I expected.
Try it again, and this time do something different with CLAUDE.md. By the way, it's happy to edit its own CLAUDE.md files (don't have an agent edit another agent's CLAUDE.md files, though [0]).
My take: Any gains from an "LLM-oriented language" will be swamped by the massive training set advantage held by existing mainstream languages. In order to compete, you would need to very rapidly build up a massive corpus of code examples in your new language, and the only way to do that is with... LLMs. Maybe it's feasible, but I suspect that it simply won't be worth the effort; existing languages are already good enough for LLMs to recursively self-improve.
Blackpill is that, for this reason, the mainstream languages we have today will be the final (human-designed) languages to be relevant on a global scale.
Eventually AIs will create their own languages. And humans will, of course, continue designing hobbyist languages for fun. But in terms of influence, there will not be another human language that takes the programming world by storm. There simply is not enough time left.
My impression is that AI models need large amounts of quality training data. "Data contamination", i.e. AI output in the training data set has been a problem for years.
The skill isn’t being right. It’s entering discussions to align on the problem.
Clarity isn’t a style preference - it’s operational risk reduction.
The punchline isn’t “never innovate.” It’s “innovate only where you’re uniquely paid to innovate.”
This isn’t strictly about self-promotion. It’s about making the value chain legible to everyone.
The problem isn’t that engineers can’t write code or use AI to do so. It’s that we’re so good at writing it that we forget to ask whether we should.
This isn’t passive acceptance but it is strategic focus.
This isn’t just about being generous with knowledge. It’s a selfish learning hack.
Insist on interpreting trends, not worshiping thresholds. The goal is insight, not surveillance.
Senior engineers who say “I don’t know” aren’t showing weakness - they’re creating permission.
There are some really solid insights here, but editing with AI to try to make up for an imperfect essay just makes the points they’re trying to convey less effective.
The blurring of which ideas are the author’s and which are AI trying to finish a half-baked (or even mostly-baked) idea removes so much of the credibility.
And it’s completely counter to the “clarity vs cleverness” idea, and to just getting something out there instead of trying to get it perfect.
Thank you for doing this. It allowed me to skip reading the article altogether immediately knowing it is AI generated slop. Usually I'm a little ways into it before my LLM detector starts going off, but these "This isn't X. It's Y." phrases are such a dead giveaway.
This is conflating two things: The stuck, and the suck.
As the author says, the time you spend stuck is the time you're actually thinking. The friction is where the work happens.
But being stuck doesn't have to suck. It does suck, most of the time, for most people; but most people have also experienced flow, where you are still thinking hard, but in a way that does not suck.
Current psychotechnology for reducing or removing the suck is very limited. The best you can do is like... meditate a lot. Or take stimulants, maybe. I am optimistic that within the next few decades we will develop much more sophisticated means of un-suckifying these experiences, so that we can dispense with cope like "it's supposed to be unpleasant" once and for all.
You certainly do not need to play music at the speed the performer intended! There are whole genres (and subgenres) based on this. :) Personally, I have found that slowing a familiar piece down by ~5% tricks my brain into perceiving it as novel again, which helps me attend to it more closely and appreciate it more.
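If you want to try this digitally, here is a minimal sketch using librosa's time stretch, which slows playback without changing pitch; the input/output filenames are just placeholders:

    # Slow a track down by ~5% without changing pitch (time-stretching).
    # "input.wav" / "output.wav" are placeholder paths.
    import librosa
    import soundfile as sf

    y, sr = librosa.load("input.wav", sr=None)           # keep the original sample rate
    slowed = librosa.effects.time_stretch(y, rate=0.95)  # rate < 1 slows playback
    sf.write("output.wav", slowed, sr)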