
It all sounds somewhat impressive (300k lines written and maintained by AI) but it's hard to judge how well the experience transfers without seeing the code and understanding the feature set.

For example, I have some code which is a series of integrations with APIs plus some data entry and web UI controls. AI does a great job there; it's all pretty shallow. The better known the APIs, the faster AI can fly through that stuff.

I have other code which is well factored and a single class does a single thing and AI can make changes just fine.

I have another chunk of code, a query language, with a tokenizer, parser, syntax tree, some optimizations, and it eventually constructs SQL. Making changes requires a lot of thought from multiple angles, and I could not safely give a vague prompt and expect good results. Common patterns need to fall into optimized paths, and new constructs need consideration of how they're going to perform and how their syntax will interact with other syntax. You need awareness not just of the language but also of the schema and how the database optimizes based on the data distribution. AI can tinker around the edges, but I can't trust it to make any interesting changes.
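
To make that concrete, here's a minimal sketch of that kind of pipeline in Ruby (tokens, then an AST, then a SQL string). Everything below is hypothetical and simplified, not the actual code described above; even in a toy like this, the interesting decisions (which patterns hit the optimized paths, how the schema and data distribution shape the emitted SQL) live outside anything the model can see in the diff.

    # Minimal illustrative sketch; hypothetical names throughout.
    require "strscan"

    Token = Struct.new(:type, :value)
    Cmp   = Struct.new(:field, :op, :value)   # e.g. age < 30
    And   = Struct.new(:left, :right)

    # Tokenize a tiny filter language, e.g.: status = "open" AND age < 30
    def tokenize(src)
      s, tokens = StringScanner.new(src), []
      until s.eos?
        s.skip(/\s+/)
        break if s.eos?
        if    s.scan(/AND\b/i)    then tokens << Token.new(:and, "AND")
        elsif s.scan(/[<>=]/)     then tokens << Token.new(:op, s.matched)
        elsif s.scan(/"([^"]*)"/) then tokens << Token.new(:str, s[1])
        elsif s.scan(/\d+/)       then tokens << Token.new(:num, s.matched.to_i)
        elsif s.scan(/\w+/)       then tokens << Token.new(:ident, s.matched)
        else  raise "unexpected input: #{s.rest.inspect}"
        end
      end
      tokens
    end

    # Parse `cmp (AND cmp)*` into a left-leaning tree.
    def parse(tokens)
      node = parse_cmp(tokens)
      while tokens.first&.type == :and
        tokens.shift
        node = And.new(node, parse_cmp(tokens))
      end
      node
    end

    def parse_cmp(tokens)
      field, op, value = tokens.shift(3)
      Cmp.new(field.value, op.value, value.value)
    end

    # Emit SQL. This toy only handles quoting; the hard part described
    # above (routing common patterns onto index-friendly plans based on
    # the schema and data distribution) would have to live here.
    def to_sql(node)
      case node
      when And then "(#{to_sql(node.left)} AND #{to_sql(node.right)})"
      when Cmp
        lit = node.value.is_a?(String) ? "'#{node.value.gsub("'", "''")}'" : node.value
        "#{node.field} #{node.op} #{lit}"
      end
    end

    puts to_sql(parse(tokenize(%q{status = "open" AND age < 30})))
    # => (status = 'open' AND age < 30)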



AI agents to me seem maximalist by default: they'll meet a requirement with 250 lines when a much simpler one-line change would do.

In an existing codebase this is easily solved by making your prompt more specific, possibly so specific that you are just describing the actual changes to make. Even then I find myself asking for refinements that simplify the approach, with suggestions for what I know would work better. The only reason I'm not writing the change myself is that the AI agent is running a TDD-style red/green/refactor loop: I can say stuff like "this is an integration test, don't use mocks and prefer to use relevant rspec matchers in favour of asserting on internal object state" and it will fix every test in the diff.
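
As a sketch of what that kind of instruction buys you (OrderImporter, Order, and payload are hypothetical stand-ins, assuming a Rails-style app):

    # Hypothetical example. Before: the agent's habit of asserting on
    # internal object state.
    it "imports the order" do
      importer = OrderImporter.new(payload)
      importer.call
      expect(importer.instance_variable_get(:@imported_ids)).to include(42)
    end

    # After "don't use mocks, prefer relevant rspec matchers": assert on
    # observable behaviour instead.
    it "imports the order" do
      expect { OrderImporter.new(payload).call }.to change(Order, :count).by(1)
      expect(Order.last).to have_attributes(external_id: 42, status: "pending")
    end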

In a brand new codebase I don't have a baseline any more; it's just an AI adding more and more to a ball of mud. I'm doing this with NixOS, and the only thing keeping it sane is that each generated file is quite small and simple (owing to the language being declarative). Even so, I have no idea whether I can actually deploy any of it yet.


I stopped using AI assistance when I realized every time the agent made a change, I had to go behind and simplify it by about 80%. It's easier, faster, more fun, and produces a better end product if I just go ahead and do it myself.

If I am feeling lazy I can have one of the chats give me its shit solution to the micro-problem at hand, extract the line I need, and integrate it properly. This is usually a little faster than reading the manual, but it's wrong often enough that I usually read the manual for everything the bot does anyway, just to make sure. That way, next time I can skip asking it.

Someone wake me up and tell me to try these tools again when the flow isn't prompting and deleting and repeat.

inb4 someone tells me there's a learning curve for this human-language product that's supposed to make me obsolete, so my CEO can do my job, because it makes coding so easy that even an experienced coder has to climb a steep learning curve, but there's going to be a white-collar bloodbath also

fucking pick a narrative, AI shills


On the contrary, this doesn’t sound impressive at all. It sounds like a cowboy coder working on relatively small projects.

300k LOC is not particularly large, and this person's writing and thinking (and stated workflow) are so scattered that I'm basically 100% certain it's a mess. I'm using all of the same models, the same tools, etc., and (importantly) reading all of the code, and I have 0% faith in any of these models to operate autonomously. Also, my opinion on the quality of GPT-5 vs Claude vs other models is wildly different.

There’s a huge disconnect between my own experience and what this person claims to be doing, and I strongly suspect the difference is that I’m paying attention, and I'm routinely disgusted by what I see.


300k especially isn’t impressive if it should have been 10k.


Yes, well put. And that’s a common failure mode.


I would guess that roughly 0.000087% of devs on the planet could do it in 10k (if that's even possible) and 37.76% would do it in 876k, so 300k is probably somewhere in the middle :)


To be fair, codebase sizes are bimodal, and 300k is large for the smaller mode of the distribution. Large enterprise codebases tend to be monorepos with a ton of generated code and a lot of duplicated functionality for different environments, so the 10-100 million line claims need to be taken with a grain of salt; a lot of the subprojects in them are well below 300k even if you pull in defs.


I'm fairly skeptical of the LLM craze, but I deeply respect Peter Steinberger's work over the years; he truly is a gifted software developer in his own right. I'm sure his personal expertise helps him guide these tools better than many could.


(OP) A third of the code is tests.

There's an Expo app, two Tauri apps, a CLI, and a Chrome extension. The admin part to help debug and test features is EXTREMELY detailed and around 40k LOC alone.

To give some perspective to that number.


Yeah, I read the post. Telling me that there's a Chrome extension and some apps tells me nothing. Saying that the code is 1/3 tests is... something, but it's not exceptional by any means.

I've got a codebase I've been writing from scratch with LLMs; it's of equivalent LOC and testing ratio, and my experience trusting the models couldn't be more different. They routinely emit hot garbage.



