I feel the same about Claude Code. It's a fast but average developer at just about everything, and the things average developers are consistently bad at, Claude is consistently bad at too.
I'm not sure; I think you overestimate the average developer. But then, the average code doesn't end up in public repositories. It spends decades rotting in enterprise codebases.
At this point I'd rather review LLM generated code than a poor developer's.
That person's actions were only possible because the administration explicitly decided to put that much unchecked power into the hands of poorly vetted individuals.
I'm teaching a class in agent development at a university. First assignment is in and I'm writing a human-in-the-loop grader for my TAs to use that's built on top of Claude Agent SDK.
Phase 1: Download the student's code from their submitted GitHub repo URL and run a series of extractions defined as skills. Did they include a README.md? What few-shot examples did they provide in their prompt? Save all of it to a JSON blob.
Phase 2: Generate a series of probe queries for their agent based on its system prompt, then run the agent locally and test it with the probes. Save the queries and results to the JSON blob.
Phase 3: For anything subjective, surface the extraction/results to the grader (TA), ask them to grade them 1-5.
The final rubric is 50% objective and 50% subjective but it's all driven by the agent.
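A rough sketch of how the 50/50 rubric above could be computed from the JSON blob. This is not the actual grader and does not use the Claude Agent SDK; the `Submission` class, the check names, and the linear mapping of 1-5 ratings onto a 0-1 scale are all assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Submission:
    """Hypothetical container for the JSON blob built up across the phases."""
    repo_url: str
    # Phase 1/2 results that can be checked mechanically, e.g. "has_readme".
    objective: dict[str, bool] = field(default_factory=dict)
    # Phase 3 ratings entered by the TA, each on a 1-5 scale.
    subjective: dict[str, int] = field(default_factory=dict)


def objective_score(checks: dict[str, bool]) -> float:
    """Fraction of objective checks passed, in [0, 1]."""
    return sum(checks.values()) / len(checks) if checks else 0.0


def subjective_score(ratings: dict[str, int]) -> float:
    """Mean 1-5 rating mapped linearly onto [0, 1] (an assumed mapping)."""
    if not ratings:
        return 0.0
    for name, rating in ratings.items():
        if not 1 <= rating <= 5:
            raise ValueError(f"rating for {name!r} must be 1-5, got {rating}")
    mean = sum(ratings.values()) / len(ratings)
    return (mean - 1) / 4


def final_grade(sub: Submission) -> float:
    """Combine both halves with equal weight and scale to 0-100."""
    return 100 * (0.5 * objective_score(sub.objective)
                  + 0.5 * subjective_score(sub.subjective))


def to_blob(sub: Submission) -> str:
    """Serialize the submission so each phase can append to the same record."""
    return json.dumps(asdict(sub), indent=2)
```

For example, a submission that passes one of two objective checks and gets TA ratings of 5 and 3 would land at 62.5 under these assumptions.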
It's just that "Thou shalt not grow a brain in a test tube and force it to play a 1993 shooter" didn't make any sense to Moses and therefore didn't make the editor's cut.
Though I disagree it would be tragic to lose this reference. It’s not a good movie. It’s basically “say thing, immediately interpret it literally”. Throw in some stereotypes from time to time. Rinse and repeat.
I wonder if it has to do with how meaning is tied to the tokens. c+amara+derie (using the official gpt-5 tokenizer).
There's also just that weird thing where they're obsessed with emoji, which I've always assumed is because they're the only logograms in English and therefore have a lot of weight per byte.
Right now I'm working two AI jobs. I build agents for enterprises and I teach agent development at a university. So I'm probably too deep to see straight.
But I think the future of programming is English.
Agent frameworks are converging on a small set of core concepts: prompts, tools, RAG, agent-as-tool, agent handoff, and state/runcontext (an LLM-invisible KV store for sharing state across tools, sub-agents, and prompt templates).
These primitives, by themselves, can cover most low-UX application business use cases. And once your tooling can be one-shotted by a coding agent, you stop writing code entirely. The job becomes naming, describing, and instructing and then wiring those pieces together with something more akin to flow-chart programming.
So I think for most application development, the kind where you're solving a specific business problem, code stops being the relevant abstraction. Even Claude Code will feel too low-level for the median developer.
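To make the state/runcontext idea above concrete, here is a minimal sketch of a key-value store that tools write to and prompt templates read from, while the model only ever sees the rendered strings. It is not taken from any particular framework; `RunContext`, `lookup_customer`, and `render_prompt` are hypothetical names, and the hard-coded tier stands in for a real backend call.

```python
from dataclasses import dataclass, field
from string import Template
from typing import Any


@dataclass
class RunContext:
    """Hypothetical per-run store. The model never sees this object directly."""
    values: dict[str, Any] = field(default_factory=dict)


def lookup_customer(ctx: RunContext, customer_id: str) -> str:
    """A tool: the return value goes to the model, side data stays in ctx."""
    ctx.values["customer_id"] = customer_id
    ctx.values["tier"] = "gold"  # placeholder; a real tool would query a backend
    return f"Found customer {customer_id}."


def render_prompt(ctx: RunContext, template: str) -> str:
    """Fill a prompt template from the context, exposing only what it names."""
    # safe_substitute leaves unknown $placeholders untouched instead of raising.
    return Template(template).safe_substitute(ctx.values)


ctx = RunContext()
tool_output = lookup_customer(ctx, "c-42")
sub_agent_prompt = render_prompt(ctx, "You are helping a $tier customer.")
```

The point of the design is that a later tool or sub-agent can pick up `tier` from the context without the model having to carry it through the conversation.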
> The job becomes naming, describing, and instructing and then wiring those pieces together with something more akin to flow-chart programming.
That's precisely what people are bad at. If people don't grasp (even intuitively) the concept of a finite state machine and the difference between states and logic, LLMs are more like a wishing well (vibes) than a code generator (tooling for engineering).
Then there's the matter of technical knowledge. Software is layers of abstraction and there's already abstraction beneath. Not knowing those will limit your problem solving capabilities.
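For anyone unsure what "the difference between states and logic" means in practice, a small sketch: the states and allowed transitions are plain data, and the only logic is a lookup that rejects anything not in the table. The review workflow and event names here are made up for illustration.

```python
from enum import Enum, auto


class State(Enum):
    DRAFT = auto()
    IN_REVIEW = auto()
    APPROVED = auto()
    REJECTED = auto()


# Transition table: (current state, event) -> next state.
# Anything not listed here is simply not allowed.
TRANSITIONS = {
    (State.DRAFT, "submit"): State.IN_REVIEW,
    (State.IN_REVIEW, "approve"): State.APPROVED,
    (State.IN_REVIEW, "reject"): State.REJECTED,
    (State.REJECTED, "revise"): State.DRAFT,
}


def step(state: State, event: str) -> State:
    """Apply one event; raise if the event is invalid in the current state."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"event {event!r} is not allowed in state {state.name}"
        ) from None


def run(events, start: State = State.DRAFT) -> State:
    """Feed a sequence of events through the machine and return the end state."""
    state = start
    for event in events:
        state = step(state, event)
    return state
```

Describing a workflow to an LLM at this level (which states exist, which events move between them) is the kind of precision the flow-chart style of programming would still require.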
You think prompting is here to stay? SQL has survived a long period of time. Servlets haven’t. We moved from assembly to higher-level languages. Flash couldn’t make it. So I’m not sure how long we will be prompting. Sure, it looks great right now (just like Flash, servlets and assembly looked back then), but I think another technology will emerge that is perhaps based on prompts behind the curtains but doesn’t look like the current prompting.
I would say prompting is not here to stay. It’s just temporary “tech”.
Eventually we might even develop some kind of language beyond English. One more precise and formalized. This way the LLM could perfectly understand what we're saying. The LLM could produce code based on that formalized language. And Google Docs is nice, but imagine some kind of editor tailored to that formalized language we create.
At this point I just want a decent Helix-Evil-Mode.