I no longer drink in pubs but in my neck of the woods, the pubs that specialised in cask ale often had lined glasses.
The problem was that many people insisted on the glass being filled to the brim, because they felt they were being short changed. So it solved one problem but created another.
Yes, there are a few tools that do this. Looking at /bin and the softlinks that are there, the various xz tools do it (unxz, lzcat, etc.). Also, vim. vimdiff and view are just softlinks to vim.
The only difference is that those tools have chosen easy to remember names rather than embedding the arguments as metadata in the filename.
As a generalisation of the idea though, the blog post is neat.
exiftool can embed options in the executable name, not just select a main mode of operation the way grep/egrep/zgrep do. That is the main difference. For example, running `exiftool(-k)` is equivalent to `exiftool -k`.
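To make the two approaches concrete, here is a minimal Python sketch of a program that inspects its own name. It is not how xz, vim or exiftool are actually implemented. The names `options_from_name`, `effective_args` and `ALIASES` are hypothetical, and the alias table is only an illustration of the "easy to remember name" style.

```python
import os
import re
import shlex

# Illustrative alias table: an easy-to-remember name implies fixed options.
ALIASES = {
    "unxz": ["--decompress"],
    "xzcat": ["--decompress", "--stdout"],
}

# base name, optional "(options)" suffix, optional ".exe" extension
NAME_PATTERN = re.compile(r"(?P<base>[^()]+?)(?:\((?P<opts>[^()]*)\))?(?:\.exe)?")


def options_from_name(argv0):
    """Split a program name such as 'tool(-k).exe' into ('tool', ['-k'])."""
    name = os.path.basename(argv0)
    match = NAME_PATTERN.fullmatch(name)
    if match is None:
        # Unbalanced parentheses or similar: treat the whole name as the base.
        return name, []
    embedded = match.group("opts") or ""
    return match.group("base"), shlex.split(embedded)


def effective_args(argv):
    """Combine alias options, options embedded in the name, and real arguments."""
    base, embedded = options_from_name(argv[0])
    return ALIASES.get(base, []) + embedded + list(argv[1:])
```

With this sketch, `effective_args(["/usr/bin/xzcat", "a.xz"])` expands via the alias table, while `effective_args(["exiftool(-k)", "photo.jpg"])` picks the option out of the name itself.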
I think the split is between people who are in a hurry and those who are not. I'm not in a hurry and so choose not to spend money to get a quicker result.
Taking time to solve a problem myself is pleasurable and I make no apologies for that.
I've heard people say that these coding agents are just tools and don't replace the thinking. That's fine but the problem for me is that the act of coding is when I do my thinking!
I'm thinking about how to solve the problem and how to express it in the programming language such that it is easy to maintain. Getting someone/something else to do that doesn't help me.
But different strokes for different folks, I suppose.
I'm similar, but I do find some natural places where LLMs can be helpful.
Just today I was working on something that involves a decent amount of configuration. It's in Python, unfortunately, and I hate passing around dictionaries for configs. I usually like to parse the JSON or YAML or whatever into a config class so I have a natural way to validate and access values without just throwing strings around.
As I was playing with the code for the actual work that needs to be done, I was thinking what configs I needed and what structure made sense. Once I knew what I needed I gave the JSON to an LLM with some instructions regarding helper functions and told it to give me the appropriate Python code. It's just a bunch of dataclasses with some from_dict or from_string methods on them, not interesting or difficult to write. Freed me up to keep working on the real problem.
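A rough sketch of the kind of code I mean, using only the standard library. The config shape here (`AppConfig`, `RetryConfig`, and their fields) is made up for illustration and is not the actual project's configuration.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class RetryConfig:
    attempts: int
    backoff_seconds: float

    @classmethod
    def from_dict(cls, data):
        # Validate once at the boundary so the rest of the code can trust the values.
        attempts = int(data["attempts"])
        if attempts < 1:
            raise ValueError("retry.attempts must be at least 1")
        backoff = float(data.get("backoff_seconds", 1.0))
        return cls(attempts=attempts, backoff_seconds=backoff)


@dataclass(frozen=True)
class AppConfig:
    name: str
    retry: RetryConfig

    @classmethod
    def from_dict(cls, data):
        return cls(name=str(data["name"]), retry=RetryConfig.from_dict(data["retry"]))

    @classmethod
    def from_string(cls, text):
        # Parse JSON text, then reuse the dictionary-based constructor.
        return cls.from_dict(json.loads(text))
```

After that, the working code reads `config.retry.attempts` instead of `config["retry"]["attempts"]`, and a bad value fails at load time rather than somewhere deep in the job.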
"All of them are moving into the direction of "less human involved and agents do more", while what I really want is better tooling for me to work closer with AI and be better at reviewing/steering it, and be more involved."
I want less ambitious LLM powered tools than what's being offered. For example, I'd love a tool that can analyse whether comments have been kept up to date with the code they refer to. I don't want it to change anything I just want it to tell me of any problems. A linter basically. I imagine LLMs would be a good foundation for this.
Any terminal tool like Claude Code or Codex (I assume OpenCode too, but I haven't tried) can do it, by using as a prompt pretty much exactly what you wrote, and if it still wants to edit, just don't approve the tool calls.
One problem I've noticed is that both Claude models and gpt-codex variants make absolutely deranged tool calls (like the `cat <<'EOF' >> foo...EOF` pattern to create a file, or sed to read a couple of lines), so it's sometimes hard to see what it is even trying to do.
"Any terminal tool like Claude Code or Codex (I assume OpenCode too, but I haven't tried) can do it, by using as a prompt pretty much exactly what you wrote, and if it still wants to edit, just don't approve the tool calls."
I'm sure it can. I'd still like a single use tool though.
But that's just my taste. I'm very simple. I don't even use an IDE.
edit: to expand on what I mean. I would love it if there was a tool that has conquered the problem and doesn't require me to chat with it. I'm all for LLMs helping and facilitating the coding process, but I'm so far disappointed in the experience. I want something more like the traditional process but using LLMs to solve problems that would be otherwise difficult to solve computationally.
I’m glad I’m not the only one who’s noticed these seemingly arbitrary calls to write files using the cat command instead of the native file edit capabilities of the agent.
The golang.org/x/ namespace is the other half of the standard library in all but name. That gets iterated often.
For stuff in the standard library proper, the versioning system is working well for it. For example, the json library is now at v2. Code relying on the original json API can still be compiled.
Interesting. I've only dipped my toe in the AI waters but my initial experience with a Go project wasn't good.
I tried out the latest Claude model last weekend. As a test I asked it to identify areas for performance improvement in one of my projects. One of the areas looked significant and truth be told, was an area I expected to see in the list.
I asked it to implement the fix. It was a dozen or so lines and I could see straightaway that it had introduced a race condition. I tested it and sure enough, there was a race condition.
I told it about the problem and it suggested a further fix that didn't solve the race condition at all. In fact, the second fix only tried to hide the problem.
I don't doubt you can use these tools well, but it's far too easy to use them poorly. There are no guard rails. I also believe that they are marketed without any care that they can be used poorly.
Whether Go is a better language for agentic programming or not, I don't know. But it may be to do with what the language is being used for. My example was a desktop GUI application and there'll be far fewer examples of those types of application written in Go.
You need to be telling it to create reproduction test cases first and iterate until it's truly solved. There's no need for you to manually be testing that sort of thing.
The key to success with agents is tight, correct feedback loops so they can validate their own work. Go has great tooling for debugging race conditions. Tell it to leverage those properly and it shouldn't have any problems solving it unless you steer it off course.
I do have a test harness. That's how I could show that the code suggested was poor.
If you mean putting the LLM in the test harness, then sure, I accept that that's the best way to use the tools. The problem is that there's nothing requiring me or anyone else to do that.
If that’s what you have to do, that makes LLMs look more like advanced fuzzers that take textual descriptions as input (“find code that segfaults calling x from multiple threads”, followed by “find changes that make the tests succeed again”) than like something truly intelligent. Or, maybe, we should see them as diligent juniors who never get tired.
I don't see any problems with either of those framings.
It really doesn't matter at all whether these things are "truly intelligent". They give me functioning code that meets my requirements. If standard fuzzers or search algorithms could do the same, I would use those too.
I accept what you say about the best way to use these agents. But my worry is that there is nothing that requires people to use them in that way. I was deliberately vague and general in my test. I don't think how Claude responded under those conditions was good at all.
I guess I just don't see what the point of these tools is. If I were to guide the tool in the way you describe, I don't see how that's better than just thinking about and writing the code myself.
I'm prepared to be shown differently of course, but I remain highly sceptical.
Just want to say upfront: this mindset is completely baffling to me.
Someone gives you a hammer. You've never seen one before. They tell you it's a great new tool with so many ways to use it. So you hook a bag on both ends and use it to carry your groceries home.
You hear lots of people are using their own hammers to make furniture and fix things around the home.
Your response is "I accept what you say about the best way to use these hammers. But my worry is that there is nothing that requires people to use them in that way."
These things are not intelligent. They're just tools. If you don't use a guide with your band saw, you aren't going to get straight cuts. If you want straight cuts from your AI, you need the right structure around it to keep it on track.
Incidentally, those structures are also the sorts of things that greatly benefit human programmers.
"These things are not intelligent. They're just tools."
Correct. But they are being marketed as being intelligent and can easily convince a casual observer that they are through the confidence of their responses. I think that's a problem. I think AI companies are encouraging people to use these tools irresponsibly. I think the tools should be improved so they can't be misused.
"Incidentally, those structures are also the sorts of things that greatly benefit human programmers."
Correct. And that's why I have testing in place and why I used it to show that the race condition had been introduced.
"Okay. If you’re being vague, you get vague results."
No. I was vague and got a concrete suggestion.
I have no issue with people using Claude in an optimal way. The problem is that it's too easy to use in a poor way.
My example was to test my own curiosity about whether these tools live up to the claims that they'll be replacing programmers. On the evidence I've seen I don't believe they will and I don't see how Go is any different to any other language in that regard.
IMO, for tools like Claude to be truly useful, they need to understand their own limitations and refuse to work unless the conditions are correct. As you say, it works best when you tell it precisely what you want. So why doesn't Claude recognise when you're not being precise and refuse to work until you are?
To reiterate, I think coding assistants are great when used in the optimal way.
If only there was a way to prevent race conditions by design as part of the language's type system, and in a way that provides rich and detailed error messages that allow coding agents to troubleshoot issues directly (without having to be prompted to write/run tests that just check for race conditions).