Architects went from drawing everything on paper to using CAD products over the course of a generation. That's a lot of years! They're still called architects.
Our tooling just had a refresh in less than 3 years and it leaves heads spinning. People are confused, fighting for or against it. Torn even between 2025 and 2026. I know I was.
People need a way to describe it, ranging from 'agentic coding' to 'vibe coding' to 'modern AI-assisted stack'.
We don't call architects 'vibe architects' even though they copy-paste 4/5th of your next house and use a library of things in their work!
We don't call builders 'vibe builders' for using earth-moving machines instead of a shovel...
When was the last time you reviewed the machine code produced by a compiler? ...
The real issue this industry is facing is the phenomenal speed of change. But what are we really doing? That's right: programming.
"When was the last time you reviewed the machine code produced by a compiler?"
Compilers will produce working output given working input literally 100% of the time in my career. I've never personally found a compiler bug.
Meanwhile AI can't be trusted to give me a recipe for potato soup. That is to say, I would under no circumstances blindly follow the output of an LLM I asked to make soup. While I have, every day of my life, gladly sent all of the compiler output to the CPU without ever checking it.
The compiler metaphor is simply incorrect and people trying to say LLMs compile English into code insult compiler devs and English speakers alike.
> Compilers will produce working output given working input literally 100% of the time in my career.
In my experience this isn't true. People just assume their code is wrong and mess with it until they inadvertently do something that works around the bug. I've personally reported 17 bugs in GCC over the last 2 years and there are currently 1241 open wrong-code bugs.
These are still deterministic bugs, which is the point the OP was making. They can be found and solved once. Most of those bugs are simply not that important, so they never get attention.
LLMs, on the other hand, are non-deterministic, unpredictable and fuzzy by design. That makes them not ideal when trying to produce output which is provably correct - sure, you can generate output and then laboriously check it - some people find that useful, some are yet to find it useful.
It's a little like using Bitcoin to replace currencies - sure, you can do that, but it includes design flaws which make it fundamentally unsuited to doing so. 10 years ago we had rabid defenders of these currencies telling us they would soon take over the global monetary system and replace it; nowadays, not so much.
Sure, Bitcoin is at least deterministic, but IMO (and that of many in the finance industry) it's solving entirely the wrong problem - in practice people want trust and identity in transactions much more than they want something distributed and trustless.
In a similar way LLMs seem to me to be solving the wrong problem - an elegant and interesting solution, but a solution to the wrong problem (how can I fool humans into thinking the bot is generally intelligent), rather than the right problem (how can I create a general intelligence with knowledge of the world). It's not clear to me we can jump from the first to the second.
Bitcoin transactions rely on mining to notarize, which is by design (due to the nature of the proof-of-work system) incredibly non-deterministic.
So when you submit a transaction, there is no hard and fast point in the future when it is "set in stone". Only a geometrically decreasing likelihood over time that a transaction might get overturned, improving by another geometric notch with every confirmed mined block that has notarized your transaction.
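To put rough numbers on that decay, here's a minimal sketch of the catch-up estimate from section 11 of the Bitcoin whitepaper (the 10% hash-rate share is just an illustrative parameter):

```python
import math

def overturn_probability(q: float, z: int) -> float:
    # Nakamoto's section-11 estimate of the chance an attacker holding a
    # fraction q of the hash rate eventually rewrites a transaction that
    # already has z confirmations.
    p = 1.0 - q
    if q >= p:
        return 1.0
    lam = z * (q / p)
    caught_up = 0.0
    for k in range(z + 1):
        poisson = math.exp(-lam) * lam ** k / math.factorial(k)
        caught_up += poisson * (1.0 - (q / p) ** (z - k))
    return 1.0 - caught_up

for z in (0, 1, 2, 6, 10):
    print(z, f"{overturn_probability(q=0.10, z=z):.7f}")
# 0 1.0000000, 1 0.2045873, 2 0.0509779, 6 0.0002428, 10 0.0000012
# -> never exactly zero, just another geometric-ish notch per block.
```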
A lot of these design principles are compromises to help support an actually zero-trust ledger in contrast to the incumbent centralized-trust banking system, but they definitely disqualify bitcoin transactions as "deterministic" by any stretch of the imagination. They have quite a bit more in common with LLM text generation than one might have otherwise thought.
> I've personally reported 17 bugs in GCC over the last 2 years
You are an extreme outlier. I know about two dozen people who work with C(++) and not a single one of them has ever told me that they've found a compiler bug when we've talked about coding and debugging - it's been exclusively them describing PEBCAK.
I've been using C++ for over 30 years. 20-30 years ago I was mostly using MSVC (including version 6), and it absolutely had bugs, sometimes in handling the language spec correctly and sometimes in code generation.
Today, I use gcc and clang. I would say that compiler bugs are not common in released versions of those (i.e. not alpha or beta), but they do still occur. Although I will say I don't recall the last time I came across a code generation bug.
I knew one person reporting gcc bugs, and IIRC those were all niche scenarios where it generated slightly suboptimal machine code, not anything otherwise observable in behavior.
Right - I'm not saying that it doesn't happen, but that it's highly unusual for the majority of C(++) developers, and that some bugs are "just" suboptimal code generation (as opposed to functional correctness, which the GP was arguing).
I'm not arguing that LLMs are at a point today where we can blindly trust their outputs in most applications, I just don't think that 100% correct output is necessarily a requirement for that. What it needs to be is correct often enough that the cost of reviewing the output far outweighs the average cost of any errors in the output, just like with a compiler.
This even applies to human written code and human mistakes, as the expected cost of errors goes up we spend more time on having multiple people review the code and we worry more about carefully designing tests.
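To make that concrete, a toy expected-value framing of when review pays off (all numbers made up, purely illustrative):

```python
def review_worth_it(p_error: float, cost_of_error: float, cost_of_review: float) -> bool:
    # Expected cost of shipping an unreviewed error vs. the cost of checking it.
    return p_error * cost_of_error > cost_of_review

print(review_worth_it(p_error=0.001, cost_of_error=500, cost_of_review=50))   # False: just ship
print(review_worth_it(p_error=0.05, cost_of_error=5000, cost_of_review=50))   # True: review it
```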
If natural language is used to specify work to the LLM, how can the output ever be trusted? You'll always need to make sure the program does what you want, rather than what you said.
>"You'll always need to make sure the program does what you want, rather than what you said."
Yes, making sure the program does what you want. Which is already part of the existing software development life cycle. Just as using natural language to specify work already is: it's where things start and return to over and over throughout any project. Further: LLMs frequently understand what I want better than other developers. Sure, lots of times they don't. But they're a lot better at it than they were 6 months ago, and a year ago they barely did so at all, save for scripts of a few dozen lines.
That's exactly my point, it's a nice tool in the toolbox, but for most tasks it's not fire-and-forget. You still have to do all the same verification you'd need to do with human written code.
Just create a prompt so specific and so detailed that it effectively becomes a set of instructions, and you've come up with the most expensive programming language.
You trust your natural language instructions a thousand times a day. If you ask for a large black coffee, you can trust that’s more or less what you’ll get. Occasionally you may get something so atrocious that you don’t dare to drink it, but generally speaking you trust that the coffee shop knows what you want. If you insist on a specific amount of coffee brewed at a specific temperature, however, you need tools to measure.
AI tools are similar. You can trust them because they are good enough, and you need a way (testing) to make sure what is produced meets your specific requirements. Of course they may fail for you; that doesn’t mean they aren’t useful in other cases.
What’s to stop the barista putting sulphuric acid in your coffee? Well, mainly they don’t because they need a job and don’t want to go to prison. AIs don’t go to prison, so you’re hoping they won’t do it because you’ve prompted them well enough.
The person I'm replying to believes that there will be a point when you no longer need to test (or review) the output of LLMs, similar to how you don't think about the generated asm/bytecode/etc of a compiler.
That's what I disagree with - everything you said is obviously true, but I don't see how it's related to the discussion.
I don't necessarily think we'll ever reach that point and I'm pretty sure we'll never reach that point for some higher risk applications due to natural language being ambiguous.
There are however some applications where ambiguity is fine. For example, I might have a recipe website where I tell a LLM to "add a slider for the user to scale the number of servings". There's a ton of ambiguity there but if you don't care about the exact details then I can see a future where LLMs do something reasonable 99.9999% of the time and no one does more than glance at it and say it looks fine.
How long it is until we reach that point, and whether we'll ever reach it, is of course still up for debate, but I don't think it's completely unrealistic.
The challenge not addressed by this line of reasoning is the sheer scale of output validation required on the back end of LLM-generated code. Human hand-developed code was no great shakes on the validation front either, but the smaller scale hid the problem.
I’m hopeful that what used to be tedious about the software development process (like correctness proving or documentation) becomes tractable enough with LLMs to make the scale more manageable for us. That’s exciting to contemplate; think of the complexity categories we can feasibly challenge now!
Or the argument that "well, at some point we can come up with a prompt language that does exactly what you want and you just give it a detailed spec." A detailed spec is called code. It's the most round-about way to make a programming language, and even then the result still isn't deterministic.
Exactly the point. AI is absolutely BS that just gets peddled by shills.
It does not work. It might work for some JS bullcrap. But take existing code and ask it to add Capsicum next to an ifdef of pledge. Watch the mayhem unfold.
This is obviously besides the point but I did blindly follow a wiener schnitzel recipe ChatGPT made me and cooked for a whole crew. It turned out great. I think I got lucky though, the next day I absolutely massacred the pancakes.
I genuinely admire your courage and willingness (or perhaps just chaos energy) to attempt both wiener schnitzel and pancakes for a crew, based on AI recipes, despite clearly limited knowledge of either.
Recent experiments with LLM recipes (ChatGPT): it missed the salt in a rice recipe, then flubbed whether that type of rice should be washed according to the recipe it was supposedly summarizing (and lied about it, too)…
Probabilistic generation will be weighted towards the means in the training data. Do I want my code looking like most code most of the time, in a world full of Node.js and PHP? Am I better served by rapid delivery from a non-learning algorithm that requires eternal vigilance and critical re-evaluation, or by slower delivery with a single review, filtered through a meatspace actor who will build out trustable modules in a linear fashion with known failure modes already addressed by process (i.e. TDD, specs, integration & acceptance tests)?
I’m using LLMs a lot, but can’t shake the feeling that the TCO and total time shakes out worse than it feels as you go.
There was a guy a few months ago who found that telling the AI to do everything in a single PHP file actually produced significantly better results, i.e. it worked on the first try. Otherwise it defaulted to React, 1GB of node modules, and a site that wouldn't even load.
>Am I better served
For anything serious, I write the code "semi-interactively", i.e. I just prompt and verify small chunks of the program in rapid succession. That way I keep my mental model synced the whole time, I never have any catching up to do, and honestly it just feels good to stay in the driver's seat.
Pro-tip: Do NOT use LLMs to generate recipes, use them to search the internet for a site with a trustworthy recipe, for information on cooking techniques, science, or chemistry, or if you need ideas about pairings and/or cooking theory / conventions. Do not trust anything an LLM says if it doesn't give a source, it seems people on the internet can't cook for shit and just make stuff up about food science and cooking (e.g. "searing seals in the moisture", though most people know this is nonsense now), so the training data here is utterly corrupt. You always need to inspect the sources.
I don't even see how an LLM (or frankly any recipe) that is a summary / condensation of various recipes can ever be good, because cooking isn't something where you can semantically condense or even mathematically combine various recipes together to get one good one. It just doesn't work like that, there is just one secret recipe that produces the best dish, and the way to find this secret recipe is by experimenting in the real world, not by trying to find some weighting of a bunch of different steps from a bunch of different recipes.
Plus, LLMs don't know how to judge quality of recipes at all (and indeed hallucinate total nonsense if they don't have search enabled).
If you have lots of experience from years of serious cooking, like I do, almost everything the LLM suggests or outputs re: cooking is false, bad or at best incredibly sub-par, and you will spend far more time correcting it and/or pushing it toward what you already know for it to be actually helpful / productive in getting you anything actually true. I also think it just messes up incredibly basic stuff all the time. I re-iterate it is only good for the things I said.
Whether or not you think you can get "good" recipes out of it will also depend on your experience with cuisine and cooking, and your own pickiness. I am sure amateurs or people who cook only occasionally can get use out of it, but it is not useful for me.
Cooking is a very different world from coding: recipes aren't composable like code (within-recipe ratios need to be maintained, i.e. recipes written in bakers' ratios/proportions, steps are almost always sequentially dependent, and ingredients need to complement each other), and most sources besides the few good empirical ones don't actually verify anything they make, which is a problem, because the training data for cooking is far more poisoned.
I also cook daily at home, for fun (though I have catered a couple times for some large 50+ people family events too). Just, in my case, cooking is my passion, and has been more than just a minor hobby for me. I.e. there have been many years of my life where I spent 3-5 hours of every day cooking, and this has been the case for about 15 years now. If "professional home cook" was a thing, I'd be that, but, alas.
So my standards are admittedly probably a bit deranged relative to most...
Everything more complex than a hello-world has bugs. Compiler bugs are uncommon, but not that uncommon. (I must have debugged a few ICEs in my career, but luckily have had more skilled people to rely on when code generation itself was wrong.)
I had a fun bug while building a smartwatch app that was caused by the sample rate of the accelerometer increasing when the device heated up. I had code that was performing machine learning on the accelerometer data, which would mysteriously get less accurate during prolonged operation. It turned out that we gathered most of our training data during shorter runs when the device was cool, and when the device heated up during extended use, it changed the frequencies of the recorded signals enough to throw off our model.
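A minimal sketch of that failure mode (invented numbers; the point is that the same motion shows up at a shifted frequency once the true sample rate drifts away from the rate the pipeline assumes):

```python
import numpy as np

def dominant_freq(samples: np.ndarray, assumed_fs: float) -> float:
    # Strongest FFT bin, interpreted with the sample rate the pipeline assumes.
    spectrum = np.abs(np.fft.rfft(samples))
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / assumed_fs)
    return freqs[spectrum[1:].argmax() + 1]  # skip the DC bin

nominal_fs = 50.0     # rate the feature extraction assumes
motion_hz = 2.0       # the same physical motion in both cases

for actual_fs in (50.0, 55.0):  # cool device vs. warmed-up device
    t = np.arange(0, 10, 1.0 / actual_fs)             # 10 s of raw samples
    samples = np.sin(2 * np.pi * motion_hz * t)
    print(actual_fs, round(dominant_freq(samples, nominal_fs), 2))
# 50.0 -> 2.0 Hz, 55.0 -> ~1.82 Hz: features shift, the model degrades
```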
I've also used a logic analyzer to debug communications protocols quite a few times in my career, and I've grown to rather like that sort of work, tedious as it may be.
Just this week I built a VFS using FUSE and managed to kernel panic my Mac a half-dozen times. Very fun debugging times.
I remember the time I spent hours debugging a feature that worked on Solaris and Windows but failed to produce the right results on SGI. Turns out the SGI C++ compiler silently ignored the `throw` keyword! Just didn’t emit an opcode at all! Or maybe it wrote a NOP.
All I’m saying is, compilers aren’t perfect.
I agree about determinism though. And I mitigate that concern by prompting AI assistants to write code that solves a problem, instead of just asking for a new and potentially different answer every time I execute the app.
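Roughly the difference between the two patterns, sketched with a hypothetical `ask_llm` stand-in (not any real API):

```python
# `ask_llm` is a hypothetical placeholder for whatever assistant you use.
def ask_llm(prompt: str) -> str:
    raise NotImplementedError("placeholder for an LLM call")

# Pattern 1 (what the comment avoids): a fresh, possibly different answer
# is generated on every single run of the app.
def total_due_per_run(items: list[dict]) -> float:
    return float(ask_llm(f"Sum the prices in {items} and add 8% tax"))

# Pattern 2 (the mitigation described above): ask the assistant once to
# write the function, review it, commit it, then run the same deterministic
# code from then on.
def total_due_committed(items: list[dict]) -> float:
    return round(sum(item["price"] for item in items) * 1.08, 2)

print(total_due_committed([{"price": 10.0}, {"price": 2.5}]))  # 13.5
```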
> Meanwhile AI can't be trusted to give me a recipe for potato soup.
This just isn't true any more. Outside of work, my most common use case for LLMs is probably cooking. I used to frequently second guess them, but no longer - in my experience SOTA models are totally reliable for producing good recipes.
I recognize that at a higher level we're still talking about probabilistic recipe generation vs. deterministic compiler output, but at this point it's nonetheless just inaccurate to act as though LLMs can't be trusted with simple (e.g. potato soup recipe) tasks.
Just to nitpick - compilers (and, to some extent, processors) weren't deterministic a few decades ago. Getting them to be deterministic has been a monumental effort - see build reproducibility.
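A toy model of what that effort was about (not how any real compiler works; SOURCE_DATE_EPOCH is the convention the reproducible-builds project settled on for pinning embedded timestamps):

```python
import hashlib, os, time

def build(source: str) -> bytes:
    # Toy "compiler" that embeds a build timestamp, as many real toolchains
    # did (and some still do). Honoring SOURCE_DATE_EPOCH pins it.
    ts = os.environ.get("SOURCE_DATE_EPOCH", str(int(time.time())))
    return f"{source}\n// built at {ts}".encode()

def digest(artifact: bytes) -> str:
    return hashlib.sha256(artifact).hexdigest()

a = digest(build("int main() { return 0; }"))
time.sleep(1)
b = digest(build("int main() { return 0; }"))
print(a == b)  # False: identical input, different artifact

os.environ["SOURCE_DATE_EPOCH"] = "0"
print(digest(build("int main() { return 0; }")) ==
      digest(build("int main() { return 0; }")))  # True: reproducible again
```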
There's also no canonical way to write software, so in that sense generating code is more similar to coming up with a potato soup recipe than compiling code.
That is not the issue; any potato soup recipe would be fine. The issue is that it might fetch values from different recipes and give you an abomination.
This exactly. I cook as a passion, and LLMs routinely and very clearly (weighted) "average" together different recipes to produce, in the worst case, disgusting monstrosities, or, in the best case, just a near-replica of some established site's recipe.
At least with the LLM, you don't have to wade through paragraph after paragraph of "I remember playing in the back yard as a child, I would get hungry..."
In fact LLMs write better and more interesting prose than the average recipe site.
It's not hard to scroll to the bottom of a page, IMO, but regardless, sites like you are mentioning have trash recipes in most cases.
I only go with resources where the text is actual documentation of their testing and/or the steps they've made, or other important details (e.g. SeriousEats, Whats Cooking America / America's Test Kitchen, AmazingRibs, Maangchi for Korean, vegrecipesofindia, Modernist series, etc) or look for someone with some credibility (e.g. Kenji Lopez, other chef on YouTube). In this case the text or surrounding content is valuable and should not be skipped. A plain recipe with no other details is generally only something an amateur would trust.
If you need a recipe, you don't know how to make it by definition, so you need more information to verify that the recipe is done soundly. There is also no reason to assume / trust that the LLM's summary / condensation of various recipes is good, because cooking isn't something where you can semantically condense or even mathematically combine various recipes together to get one good one. It just doesn't work like that: there is just one secret recipe that produces the best dish, and LLMs mostly don't know how to judge the quality of recipes.
I've never had an LLM produce something better or more trustworthy than any of those sites I mentioned, and have had it just make shit up when dealing with anything complicated (i.e. when trying to find the optimal ratio of starch to flour for Korean fried chicken, it just confidently claimed 50/50 is best, when this is obviously total trash to anyone who has done this).
The only time I've ever found LLMs useful for cooking is when I need to cook something obscure that only has information in a foreign language (e.g. icefish / noodlefish), or when I need to use it for search about something involving chemistry or technique (it once quickly found me a paper proving that baking soda can indeed be used to tenderize squid - but only after I prompted it further to get sources and go beyond its training data, because it first hallucinated some bullshit about baking soda only working on collagen or something, which is just not true at all).
So I would still never trust or use the quantities it gives me for any kind of cooking / dish without checking or having the sources, instead I would rely on my own knowledge and intuitions. This makes LLMs useless for recipes in about 99% of cases.
The difference is that I have a few sites and resources I already know are NOT useless. With an LLM output, I have to check and verify every time, and, since LLMs are trained on junk, they almost always produce junk. But with the trusted sites, I do not have to check, and almost always get something decent and/or close to authentic!
The difference is between a trusted source that is good most of the time, vs. an LLM recipe that is trash 99% of the time.
EDIT: If you haven't visited any / all of the sites / sources I mentioned, check them out! They are really good, especially SeriousEats if the recipe is from Kenji Lopez. Maybe just avoid AmazingRibs, unless you have uBlock installed: they were way ahead of their time, but haven't updated in forever, and clearly have become desperate...
I think things can only be called revolutions in hindsight - while they are going on it's hard to tell if they are a true revolution, an evolution or a dead-end. So I think it's a little premature to call Generative AI a revolution.
AI will get there and replace humans at many tasks, machine learning already has, I'm not completely sure that generative AI will be the route we take, it is certainly superficially convincing, but those three years have not in fact seen huge progress IMO - huge amounts of churn and marketing versions yes, but not huge amounts of concrete progress or upheaval. Lots of money has been spent for sure! It is telling for me that many of the real founders at OpenAI stepped away - and I don't think that's just Altman, they're skeptical of the current approach.
What I don't understand about these arguments is that the input to the LLMs is natural language, which is inherently ambiguous. At which point, what does it even mean for an LLM to be reliable?
And if you start feeding an unambiguous, formal language to an LLM, couldn't you just write a compiler for that language instead of having the LLM interpret it?
1) Compilers are deterministic (modulo bugs), but most things in life are not, yet can still be reliable.
The opposite also holds: "npm install && npm run build" can work today and fail in a year (due to ecosystem churn) even though every single component in that chain is deterministic.
2) Reliability is a continuum, not a discrete yes/no. In practice, we want things to be reliable enough (where "enough" is determined per domain).
I don't presume this will immediately change your mind, but hopefully will open your eyes to looking at this a bit differently.
> I don't presume this will immediately change your mind
I'm not saying that AI isn't useful. I'm just claiming it's not analogous to a compiler. If it was, you would treat your prompts as source code, and check them into source control. Checking the output of an LLM into source control is analogous to committing the machine code output from a compiler into source control.
My question still stands though. What does it mean for a tool to be reliable when the input language is ambiguous? This isn't just about the LLM being nondeterministic. At some point those ambiguities need to be resolved, either by the prompter, or the LLM. But the resolution to those ambiguities doesn't exist in the original input.
> We don't call architects 'vibe architects' even though they copy-paste 4/5th of your next house and use a library of things in their work!
> We don't call builders 'vibe builders' for using earth-moving machines instead of a shovel...
> When was the last time you reviewed the machine code produced by a compiler?
Sure, because those are categorically different. You are describing shortcuts of two classes: boilerplate (library of things) and (deterministic/intentional) automation. Vibe coding doesn't use either of those things. The LLM agents involved might use them, but the vibe coder doesn't.
Vibe coding is delegation, which is a completely different class of shortcut or "tool" use. If an architect delegates all their work to interns, directs outcomes based on whims rather than principles, and doesn't actually know what the interns are delivering, yeah, I think it would be fair to call them a vibe architect.
We didn't have that term before, so we usually just call those people "arrogant pricks" or "terrible bosses". I'm not super familiar but I feel like Steve Jobs was pretty famously that way - thus if he was an engineer, he was a vibe engineer. But don't let this last point detract from the message, which is that you're describing things which are not really even similar to vibe coding.
I do not see LLM coding as another step up on the ladder of programming abstraction.
If your project is in, say, Python, then by using LLMs, you are not writing software in English; you are having an LLM write software for you in Python.
This is much more like delegation of work to someone else, than it is another layer in the machine-code/assembly/C/Python sort of hierarchy.
In my regular day job, I am a project manager. I find LLM coding to be effectively project management. As a project manager, I am free to dive down to whatever level of technical detail I want, but by and large, it is others on the team who actually write the software. If I assign a task, I don't say "I wrote that code", because I didn't; someone else did, even if I directed it.
And then, project management, delegating to the team, is most certainly nondeterministic behavior. Any programmer on the team might come up with a different solution, each of which works. The same programmer might come up with more than one solution, all of which work.
I don't expect the programmers to be deterministic. I do expect the compiler to be deterministic.
I think you are right in placing emphasis on delegation.
There’s been a hypothesis floating around that I find appealing. Seemingly you can identify two distinct groups of experienced engineers. Manager, delegator, or team lead style senior engineers are broadly pro-AI. The craftsman, wizard, artist, IC style senior engineers are broadly anti-AI.
But coming back to architects, or most professional services and academia to be honest, I do think the term vibe architect as you define it is exactly how the industry works. An underclass of underpaid interns and juniors do the work, hoping to climb higher and position themselves towards the top of the ponzi-like pyramid scheme.
Totally on point, except I'm pretty sure Jobs was not like that. From what I've read he'd be more of a hands-on "agentic engineer", baby-sitting his engineers and designers and steering them.
> We don't call architects 'vibe architects' even though they copy-paste 4/5th of your next house and use a library of things in their work!
An architect's copy-pasting is equivalent to a software developer reusing a tried and tested code library. Generating or writing new code is fundamentally different and not at all comparable.
> We don't call builders 'vibe builders' for using earth-moving machines instead of a shovel...
We would call them "vibe builders" if their machines threw bricks around randomly and the builders focused all of their time on engineering complex scaffolding around the machines to get the bricks flying roughly in the right direction.
But we don't because their machines, like our compilers and linters, do one job and they do it predictably. Most trades spend obscene amounts of money on tools that produce repeatable results.
> That's a lot of years! They're still called architects.
Because they still architect, they don't subcontract their core duties to architecture students overseas and just sign their name under it.
I find it fitting and amusing that people who are uncritical towards the quality of LLM-generated work seem to make the same sorts of reasoning errors that LLMs do. Something about blind spots?
Very likely, yes. One day we'll have a clearer understanding of how minds generalize concepts into well-trodden paths even when they're erroneous, and it'll probably shed a lot of light onto concepts like addiction.
Don't take this as criticizing LLMs as a whole, but architects also don't call themselves engineers. Engineers are an entirely distinct set of roles that among other things validate the plan in its totality, not only the "new" 1/5th. Our job spans both of these.
"Architect" is actually a whole career progression of people with different responsibilities. The bottom rung used to be the draftsmen, people usually without formal education who did the actual drawing. Then you had the juniors, mid-levels, seniors, principals, and partners who each oversaw different aspects. The architects with their name on the building were already issuing high level guidance before the transition instead of doing their own drawings.
> When was the last time you reviewed the machine code produced by a compiler?
Last week, to sanity check some code written by an LLM.
> Engineers are an entirely distinct set of roles that among other things validate the plan in its totality, not only the "new" 1/5th. Our job spans both of these.
Where this analogy breaks down is that the work you’re describing is done by Professional Engineers that have strict licensing and are (criminally) liable for the end result of the plans they approve.
That is an entirely different role from the army of civil, mechanical, and electrical engineers (some who are PEs and some who are not) who do most of the work for the principal engineer/designated engineer/engineer of record, that have to trust building codes and tools like FEA/FEM that then get final approval from the most senior PE. I don’t think the analogy works, as software engineers rarely report to that kind of hierarchy. Architects of Record on construction projects are usually licensed with their own licensing organization too, with layers of licensed and unlicensed people working for them.
That diversity of roles is what "among other things" was meant to convey. My job at least isn't terribly different, except that licensing doesn't exist and I don't get an actual stamp. My company (and possibly me depending on the facts of the situation) is simply liable if I do something egregious that results in someone being hurt.
> Where this analogy breaks down is that the work you’re describing is done by Professional Engineers that have strict licensing and are (criminally) liable for the end result of the plans they approve.
there are plenty of software engineers that work in regulated industries, with individual licensing, criminal liability, and the ability to be struck off and banned from the industry by the regulator
It's not that PE's can't design or review buildings in whatever city the egregious failure happened.
It's that PE's can't design or review buildings at all in any city after an egregious failure.
It's not that PE's can't design or review hospital building designs because one of their hospital designs went so egregiously sideways.
It's that PE's can't design or review any building for any use because their design went so egregiously sideways.
I work in an FDA regulated software area. I need 510k approval and the whole nine. But if I can't write regulated medical or dental software anymore, I just pay my fine and/or serve my punishment and go sling React/JS/web crap or become a TF/PyTorch monkey. No one stops me. Consequences for me messing up are far less severe than the consequences for a PE messing up. I can still write software because, in the end, I was never an "engineer" in that hard sense of the word.
Same is true of any software developer. Or any unlicensed area of "engineering" for that matter. We're only playing at being "engineers" with the proverbial "monopoly money". We lose? Well, no real biggie.
PE's agree to hang a sword of damocles over their own heads for the lifetime of the bridge or building they design. That's a whole different ball game.
>if I approve a bad release that leads to an egregious failure, for me it's a prison sentence and unlimited fines
Again, I'm in 510k land. The same applies to myself. No one's gonna allow me to irradiate a patient with a 10x dose because my bass ackwards software messed up scientific notation. To remove the wrong kidney because I can't convert orthonormal basis vectors correctly.
But the fact remains that no one would stop either of us from writing software in the future in some other domain.
They do stop PE's from designing buildings in the future in any other domain. By law. So it's very much a different ball game. After an egregious error, we can still practice our craft, because we aren't "engineers" at the end of the day. (Again, "engineers" in that hard sense of the word.) PE's can't practice their craft any longer after an egregious error. Because they are "engineers" in that hard sense of the word.
Reasoning by analogy is usually a bad idea, and nowhere is this worse than talking about software development.
It’s just not analogous to architecture, or cooking, or engineering. Software development is just its own thing. So you can’t use analogy to get yourself anywhere with a hint of rigour.
The problem is, AI is generating code that may be buggy, insecure, and unmaintainable. We have as a community spent decades trying to avoid producing that kind of code. And now we are being told that productivity gains mean we should abandon those goals and accept poor quality, as evidenced by MoltBook’s security problems.
It’s a weird cognitive dissonance and it’s still not clear how this gets resolved.
Now then, Moltbook is a pathological case. Either it remains a pathological case or our whole technological world is gonna stumble HARD as all the fundamental things collapse.
I prefer to think Moltbook is a pathological case and unrepresentative, but I've also been rethinking a sort of game idea from computer-based to entirely paper/card based (tariffs be damned) specifically for this reason. I wish to make things that people will have even in the event that all these nice blinky screens are ruined and go dark.
Just the first system coded by AI that I could think of. Note this is unrelated to the fact that its users are LLMs - the problem was in the development of Moltbook itself.
It's not about the tooling, it's about the reasoning. An architect copy-pasting existing blueprints is still in charge and has to decide what to copy-paste and where. Same as a programmer slapping a bunch of code together, plumbing libraries or writing fresh code. They are the ones who drive the logical reasoning and the building process.
The AI tooling reverses this: the thinking is outsourced to the machine and the user is borderline nothing more than a spectator, an observer and a rubber stamp on top.
Anyone who is in this position seriously needs to think about their value added. How do they plan to justify their position and salary to the capital class? If the machine is doing the work for you, why would anyone pay you as much as they do when they can just replace you with someone cheaper, ideally with no one, for maximum profit?
Everyone is now in a competition not only against each other but also against the machine. And any specialized, expert-knowledge moat that you've built over decades of hard work is about to evaporate.
This is the real pressing issue.
And the only way you can justify your value added, your position, your salary, is to be able to undermine the AI, to find flaws in its output and reasoning. After all, if/when it becomes flawless, you have no purpose to the capital class!
> The AI tooling reverses this: the thinking is outsourced to the machine and the user is borderline nothing more than a spectator, an observer and a rubber stamp on top.
I find it a bit rare that this is the case though. Usually I have to carefully review what it's doing and guide it. Either by specific suggestions, or by specific tests, etc. I treat it as a "code writer" that doesn't necessarily understand the big picture. So I expect it to fuck up, and correcting it feels far less frustrating if you consider it a tool you are driving rather than letting it drive you. It's great when it gets things right but even then it's you that is confirming this.
This is exactly what I said in the end. Right now you rely on it fucking things up. What happens to you when the AI no longer fucks things up? Sorry to say, but your position is no longer needed.
Architects went from drawing everything on paper to using CAD, not over a generation, but over a few years, after CAD and computers got good enough.
It therefore depends on where we place the discovery/availability of the product. If we place it at the time of prototype production (in the early 1960s for CAD), it took a generation (20-30 years), since by the early and mid-1990s, all professionals were already using CAD.
But if we place it at the time when CAD and personal computers became available to the general public (e.g., mid-1980s), it took no more than 5-10 years. I attended a technical school in the 1990s, and we started with hand drawing in the first two years and used CAD systems in the remaining three years of school.
The same can be said for AI. If we place the beginning of AI in the mid-1980s, the wider adoption of AI took more than a generation. If we place it at the time OpenAI developed GPT, it took 5-10 years.
> We don't call architects 'vibe architects' even though they copy-paste 4/5th of your next house and use a library of things in their work!
Maybe not, but we don't allow non-architects to vomit out thousands of diagrams that they cannot review, and that are never reviewed, which are subsequently used in the construction of the house.
Your analogy to s/ware is fatally and irredeemably flawed, because you are comparing the regulated and certification-heavy production of content, which is subsequently double-checked by certified professionals, with an unregulated and non-certified production of content which is never checked by any human.
I don't see a flaw, I think you're just gatekeeping software creation.
Anyone can pick up some CAD software and design a house if they so desire. Is the town going to let you build it without a certified engineer/architect signing off? Fuck no. But we don't lock down CAD software.
And presumably, mission critical software is still going to be stamped off on by a certified engineer of some sort.
> Anyone can pick up some CAD software and design a house if they so desire. Is the town going to let you build it without a certified engineer/architect signing off? Fuck no. But we don't lock down CAD software.
No, we lock down using that output from the CAD software in the real world.
> And presumably, mission critical software is still going to be stamped off on by a certified engineer of some sort.
The "mission critical" qualifier is new to your analogy, but is irrelevant anyway - the analogy breaks because, while you can do what you like with CAD software on your own PC, that output never gets used outside of your PC without careful and multiple levels of review, while in the s/ware case, there is no review.
I am not really sure what you are getting at here. Are you suggesting that people should need to acquire some sort of credential to be allowed to code?
> Are you suggesting that people should need to acquire some sort of credential to be allowed to code?
No, I am saying that you are comparing professional $FOO practitioners to professional $BAR practitioners, but it's not a valid comparison because one of those has review and safety built into the process, and the other does not.
You can't use the assertion "We currently allow $FOO practitioners to use every single bit of automation" as evidence that "We should also allow $BAR practitioners to use every bit of automation", because $FOO output gets review by certified humans, and $BAR output does not.
Thanks brother. I flew half way around the world yesterday and am jetlagged as fuck from a 12 hour time change. I'm sorry, my brain apparently shut off, but I follow now. Was out to lunch.
> Thanks brother. I flew half way around the world yesterday and am jetlagged as fuck from a 12 hour time change. I'm sorry, my brain apparently shut off, but I follow now. Was out to lunch.
You know, this was a very civilised discussion; below I've got someone throwing snide remarks my way for some claims I made. You just factually reconfirmed and re-verified until I clarified my PoV.
> We don't call architects 'vibe architects' even though (…)
> We don't call builders 'vibe builders' for (…)
> When was the last time (…)
None of those are the same thing. At all. They are still all deterministic approaches. The architect’s library of things doesn’t change every time they use it or present different things depending on how they hold it. It’s useful because it’s predictable. Same for all your other examples.
If we want to have an honest discussion about the pros and cons of LLM-generated code, proponents need to stop being dishonest in their comparisons. They also need to stop plugging their ears and ignoring the other issues around the technology. It is possible to have something which is useful but whose advantages do not outweigh the disadvantages.
I think the word predictable is doing a bit of heavy lifting there.
Let’s say you shovel some dirt: you’ve got a lot of control over where you get it from and where you put it.
Now get in your big digger’s cabin and try to have the same precision. At the level of a shovel-user, you are unpredictable even if you’re skilled. Some of your work might be out by a decent fraction of the width of a shovel. That’d never happen if you did it the precise way!
But you have a ton more leverage. And that’s the game-changer.
That’s another dishonest comparison. Predictability is not the same as precision. You don’t need to be millimetric when shovelling dirt at a construction site. But you do need to do it when conducting brain surgery. Context matters.
Sure. If you’re racing your runway to go from 0 to 100 users you’d reach for a different set of tools than if you’re contributing to postgres.
In other words I agree completely with you but these new tools open up new possibilities. We have historically not had super-shovels so we’ve had to shovel all the things no matter how giant or important they are.
I’m not disputing that. What I’m criticising is the argument from my original parent post of comparing it to things which are fundamentally different, but making it look equivalent as a justification against criticism.
Earth-moving machines don't really move earth just because they are told to; the machine requires fine-grained, deterministic input in a certain language of motion to do what it does.
Imagine if the machine operator was saying "move this earth here, no, not this one, 30 centimeters to the left... yes, move this earth here, no, not here, please a bit to the right", lol.
Of course, someone with no experience with the machine would prefer telling it what to do, and he'd even achieve something rough, but someone who knows what the handles and pedals do would prefer the handles and pedals.
It's amusing how everyone seems to be going through the same journey.
I do run multiple models at once now. On different parts of the code base.
I focus solely on the less boring tasks for myself and outsource all of the slam dunks, then review. I often use another model to validate the previous model's work while doing so myself.
I do git reset still quite often but I find more ways to not get to that point by knowing the tools better and better.
In 2022, I interviewed with a company... in crypto.
I was the oldest in the company by a decade at least. They kept telling me they wanted experience. I have plenty of experience. I was cautiously optimistic.
They eventually failed me on a test of ReactJS. The funniest part was that when I asked for feedback, the reasons they gave showed poor engineering technique on their end: a lack of understanding of what makes it down the wire.
So they wanted experience, but not the experience that prevents them from making mistakes of their own; not an experience that threatened their views. I realised this later. Young rock-star developers want experienced people around them, maybe, but they want to be free to reinvent the wheel on a whim.
Now when I interview some place and I eerily feel old, I just bow out respectfully. No point wasting everyone's time.
My automatic “red flag” was btree tests. As soon as I saw one of those, I knew I was wasting my time.
I was especially annoyed by recruiters that couldn’t do math. They loved all my experience, but ghosted me, as soon as they realized it came with gray hair. I guess the place is crawling with 35-year-olds with 30 years of experience.
As it turned out, I ended up giving up, and just retiring. I had the means, but wanted to keep working for at least another decade. I really enjoyed adding value. I was especially interested in helping small companies get on their feet, as my particular skillset would have been almost ideal for that, and my “nest egg” gave me a pretty good risk tolerance, along with a willingness to take a lower base.
Turns out that these were the exact companies that didn’t want me, though.
Also turned out that I really loved being retired. I have been doing more work in the last eight years, than in a couple of decades previously. I just don’t get paid for it, and I’m fine with that. In fact, I actively resist pursuing a paycheck, as I don’t want to deal with knuckleheads, anymore.
I just had to have my hand forced. I would not have voluntarily done this.
Basically, any test that involves binary trees (sorry - "btree" is a somewhat different thing).
Realistically, most programmers never see another binary tree, after they leave school.
It's a "youth-pass filter." People right out of college will ace them. Us oldsters are less likely to do as well (unless we cram for them). In forty years of programming, I never encountered a single one, in the wild, and a lot of our image processing algorithms involved a decent amount of data crawling, so they had some relation to binary trees (shows why they teach them), but the way they were handled was much different.
It's easy to get bitter about these things.
"Experience" seems to be code for: "we've spent fifteen years painting ourself into a corner and now we need a guy who will get us out of it in three months or less".
You are however not allowed to give any feedback whatsoever about their processes, priorities, organization, promotion strategies, retention policies, etc.
Having experience usually means that you've acquired a holistic view of software development. Usually the hard way.
But they want solutions, not advice or opinions.
I've met a few devs that make a living like that. Get in, solve problems. Keep quiet. Get out. Wait for them to call back in a couple of years.
> You are however not allowed to give any feedback whatsoever about their processes, priorities, organization, promotion strategies, retention policies, etc.
Ironically, the only people who have social permission to do that are extremely expensive Big Name outside consultants. Who will then do one of two things: either speak to the staff, collate what they have to say, and launder it back to the boss; or produce a thinly veiled adaptation of whatever business book the CEO last read in an airport.
> speak to the staff, collate what they have to say, and launder it back to the boss
My wife is a management consultant and this is _exactly_ what she does in half of her projects. But it is a bit more sinister than that: the management consultants feed the info back to the _top_ bosses, bypassing the middle-management hellscape.
For example, she did a project for a big bank where she interviewed 70 or so people; her main output was a streamlined virtual machine requisition flow (which included merging a couple of teams together and configuring the ticketing system they already had). It used to take devs 6 months to get a VM. I bet the devs were yelling at their middle managers to sort it out, but their managers couldn't or wouldn't actually bring it up with upper management with a plan on how to do it.
I joke that companies could just do that internally: have some people interview the leaf nodes in the org to find out what top-down initiatives would help work get done. But companies simply don't do this.
This is a reason why, when life pushed me away from product development into consulting/agency work, I hated it at first but eventually learnt to appreciate the positive side of it.
Usually those kind of companies won't hire old employees, while at the same time will gladly pay for consulting knowledge to solve their problems.
Also, while product companies tend to hire folks for whom the very last thing they worked on checks all the bullet points on the HR job ad, agencies will gladly throw people at a problem regardless of the skills list, as long as the team learns to swim fast enough.
> agencies will gladly throw people at a problem regardless of the skills list, as long as the team learns to swim fast enough.
I did a few years at a company which was "product development consultancy", and this aspect of it was really enjoyable. We got a set of diverse challenges through the door, often "virtual startups" (CEO hiring consultants rather than staff in order to do v1 of a product). The company was basically a single room, and we had two senior guys (the founders) to review work and support us. Plus one "smartest guy in the room" who served as mathematician fire-support for things like signal processing or the rare actual DS&A problem.
In fairness to Salesforce, it was the garbage third party apps in their ecosystem which got compromised and did the leaking, not Salesforce themselves.
Feels like some company that has zero trust in its developers would roll out these. And that's your env.
I'm a bit out of the loop so I'm not sure this might already be a thing on AWS.