> The more time I spend in a codebase the better idea I have of what the writer was trying to do.
This whole thing of using LLMs to Code reminds me a bit of when Google Translate came out and became popular, right around the time I started studying Russian.
Yes, copying and pasting a block of Russian text produced a block of english text that you could get a general idea of what was happening. But translating from english to russian rarely worked well enough to fool the professor because of all the idioms, style, etc. Russian has a lot of ways you can write "compactly" with fewer words than english and have a much more precise meaning of the sentence. (I always likened russian to type-safe haskell and english to dynamic python)
If you actually understood Russian and read the text, you could uncover much deeper and subtle meaning and connections that get lost in translation.
If you went to russia today you could get around with google translate and people would understand you. But you aren't going to be having anything other than surface level requests and responses.
Coding with LLMs reminds me a lot of this. Yes, they produce something that the computer understands and runs, but the meaning and intention of what you wanted to communicate gets lost through this translation layer.
Coding is even worse because i feel like the intention of coding should never to be to belt out as many lines as possible. Coding has powerful abstractions that you can use to minimize the lines you write and crystalize meaning and intent.
This presumes that it will be real humans that have to “take care” of the code later.
A lot of the people that are hawking AI, especially in management, are chasing a future where there are no humans, because AI writes the code and maintains the code, no pesky expensive humans needed. And AI won’t object to things like bad code style or low quality code.
I think this is where we lose a lot of developers. My experience has been this is a skill set that isn’t as common as you’d hope for, and requires some experience outside developing software as its own act. In other words, this doesn’t seem to be a skill that is natural to developers who haven’t had (or availed themselves of) the opportunity to do the business analyst and requirements gathering style work that goes into translating business needs to software outcomes. Many developers are isolated (or isolate themselves) from the business side of the business. That makes it very difficult to be able to translate those needs themselves. They may be unaware of and not understand, for example, why you’d want to design an accounting feature in a SaaS application in a certain way to meet a financial accounting need.
On the flip side, non-technical management tends to underestimate and undervalue technical expertise either by ego or by naïveté. One of my grandmothers used to wonder how I could “make any money by playing on the computer all day,” when what she saw as play was actually work. Not video games, mind you. She saw computers as a means to an end, in her case entertainment or socializing. Highly skilled clients of mine, like physicians, while curious, are often bewildered that there are sometimes technical or policy limitations that don’t match their expectations and make their request untenable.
When we talk about something like an LLM, it simply doesn’t possess the ability to reason, which is precisely what is needed for that kind of work.
> 2. Letting the Definition of Done Include Customer Approval
In the olden days we used to model user workflows. Task A requires to do that and that, we do this and this and transition to workflow B. Acceptance testing was integral step of development workflow and while it did include some persuasion and deception, actual user feedback was part of the process.
As much as scrum tries to position itself as delivering value to the customer, the actual practices of modeling the customer, writing imagined user stories for them and pulling acceptance criteria out of llama's ass ensures that actual business and development team interact as little as possible. Yes, this does allow reduce the number of implemented features by quite a bunch, however by turning the tables and fitting a problem inside the solution.
Think about it, it is totally common for website layouts to shift, element focus to jump around as the website is loading or more recently two step login pages that break password managers. No sane user would sign these off, but no actual user has participated in the development process, neither at design phase, nor at testing phase.
You know this future isn't happening anytime soon. Certainly not in the next 100 years. Until then, humans will be taking care of it and no one will want to work at a place working on some Fransketeinian codebase made via an LLM. And even when humans are only working on 5% of the codebase, that will likely be the most critical bit and will have the same problems regarding staff recruitment and retention.
I think this is a bit short sighted, but I’m not sure how short. I suspect in the future, code will be something in between what it is today, and a build artifact. Do you have to maintain bytecode?
> Russian has a lot of ways you can write "compactly" with fewer words than english and have a much more precise meaning of the sentence. (I always likened russian to type-safe haskell and english to dynamic python)
Funny; my experience has been completely the opposite. I've always envied the English language for how compactly and precisely it can express meaning compared to Russian, both because of an immensely rich vocabulary, and because of the very flexible grammar.
I suspect this difference in perception may be due to comparing original texts, especially ones produced by excellent writers or ones that have been polished by generations that use them, to translations, which are almost invariably stylistically inferior to the original: less creative, less playful, less punchy, less succinct. So, if you translate a good Russian writer who is a master of his craft into English, you may feel the inadequacy of the language. Likewise, whenever I try to read translations of English prose into Russian, it reads clumsy and depressingly weak.
Translating is an interpretation of the original text. A translated book can be better than the original. But you often need mastery of the language you translate to.
> A translated book can be better than the original.
Can you give some examples?
> But you often need mastery of the language you translate to.
Professional written translation is virtually always done into your native language, not into a language you've learned later. So that mastery should be taken for granted; it's a prerequisite.
Coding isn't the hard part. The hard part is translating the business needs in code.
You can tell a junior programmer "Make a DB with tables book, author, has Written, customer, stock, hasBought, with the following rules between them. Write a repository, for that DB. Use repository in BooksService and BasketService. Use those services in Books controller and Basket controller." and he will do a fine job.
Ask the junior to write an API for a book store and he will have a harder time.
I write in Clojure and "coding" is perhaps 10% of what I do? There is very little "plumbing", or boilerplate, most of the code directly addresses the business domain. "AI" won't help me with that.
This is a great analogy. I find myself thinking that by abstracting the entire design process when coding something using generative AI tools, you tend to lose track of fine details by only concentrating on the overall function.
Maybe the code works, but does it integrate well with the rest of the codebase? Do the data structures that it created follow the overall design principles for your application? For example, does it make the right tradeoffs between time and space complexity for this application? For certain applications, memory may be an issue and while code the may work, it uses too much memory to be useful in practice.
These are the kind of problems that I think about, and it aligns with your analogy. There is in fact something "lost through this translation layer".
I think translating to russian wasn't worse than translating to English because "russian is more compact".
Probably it was worse, because people in charge in Google speak English. It was embarrassing to watch Google conferences where they proposed Google Translate to translate professional products. It's similarly embarrassing watching people proposing chatGPT lightly, because they lack the ability, or probably just don't care to, analyze the problem thoroughly.
I was helping translate some UI text for a website from English to German, my mother tongue. I found that usually the machine came up with better translations than me.
English and German are EU languages. Russian is not.
The EU maintains a large translation service to translate most EU official texts into all EU languages. So Google Translate is using that to train on. Google gets a free gift from a multinational bureaucracy and gets to look like a smart company in the process.
This is also why English-Mandarin is often poorly translated, in my opinion.
I am a professional translator, and I have been using LLMs to speed up and, yes, improve my translations for a year and a half.
When properly prompted, the LLMs produce reasonably accurate and natural translations, but sometimes there are mistakes (often the result of ambiguities in the source text) or the sentences don’t flow together as smoothly as I would like. So I check and polish the translations sentence by sentence. While I’m doing that, I sometimes encounter a word or phrase that just doesn’t sound right to me but that I can’t think how to fix. In those cases, I give the LLMs the original and draft translation and ask for ten variations of the problematic sentence. Most of the suggestions wouldn’t work well, but there are usually two or three that I like and that are better than what I could come up with on my own.
Lately I have also been using LLMs as editors: I feed one the entire source text and the draft translation, and I ask for suggestions for corrections and improvements to the translation. I adopt the suggestions I like, and then I run the revised translation through another LLM with the same prompt. After five or six iterations, I do a final read-through of the translation to make sure everything is okay.
My guess is that using LLMs like this cuts my total translation time by close to half while raising the quality of the finished product by some significant but difficult-to-quantify amount.
This process became feasible only after ChatGPT, Claude, and Gemini got longer context windows. Each new model release has performed better than the previous one, too. I’ve also tried open-weight models, but they were significantly worse for Japanese to English, the direction I translate.
Although I am not a software developer, I’ve been following the debates on HN about whether or not LLMs are useful as coding assistants with much interest. My guess is that the disagreements are due partly to the different work situations of the people on both sides of the issue. But I also wonder if some of those who reject AI assistance just haven’t been able to find a suitable interactive workflow for using it.
> While I’m doing that, I sometimes encounter a word or phrase that just doesn’t sound right to me but that I can’t think how to fix. In those cases, I give the LLMs the original and draft translation and ask for ten variations of the problematic sentence. Most of the suggestions wouldn’t work well, but there are usually two or three that I like and that are better than what I could come up with on my own.
Yes, coming up with variations that work better (and hit the right connotations) is what I used the machine for, too.
You don't even have to go back in time or use a comparatively rare language pair such as English/Russian.
Google Reviews insists on auto translating reviews to your native language (thankfully, the original review can be recovered by clicking a link). Even for English->German (probably one of the most common language pairs), the translations are usually so poor that you can't always tell what the review is trying to say. To be fair, I think that state of the art machine translation is better than whatever the hell Google is using here (google translate probably), but apparently product people at Google don't care enough to make translations better (or better, allow you to turn off this feature).
> Russian has a lot of ways you can write "compactly" with fewer words than english and have a much more precise meaning of the sentence. (I always likened russian to type-safe haskell and english to dynamic python)
The difference is in fusionality. English does not use inflection and relies heavily on auxiliaries (pre-, post-positions, particles, other modifiers) while Russian (and other Slavic, Baltic languages) rely rather heavily on inflection.
For English speakers, probably the closest is the gerund. A simple suffix transforms a verb into a noun-compatible form, denoting process. In highly fusional languages a root can be combined with multiple modifying pre-, a-, suf-fixes, and inflected on top. This does unlock some subtlety.
This whole thing of using LLMs to Code reminds me a bit of when Google Translate came out and became popular, right around the time I started studying Russian.
Yes, copying and pasting a block of Russian text produced a block of english text that you could get a general idea of what was happening. But translating from english to russian rarely worked well enough to fool the professor because of all the idioms, style, etc. Russian has a lot of ways you can write "compactly" with fewer words than english and have a much more precise meaning of the sentence. (I always likened russian to type-safe haskell and english to dynamic python)
If you actually understood Russian and read the text, you could uncover much deeper and subtle meaning and connections that get lost in translation.
If you went to russia today you could get around with google translate and people would understand you. But you aren't going to be having anything other than surface level requests and responses.
Coding with LLMs reminds me a lot of this. Yes, they produce something that the computer understands and runs, but the meaning and intention of what you wanted to communicate gets lost through this translation layer.
Coding is even worse because i feel like the intention of coding should never to be to belt out as many lines as possible. Coding has powerful abstractions that you can use to minimize the lines you write and crystalize meaning and intent.