My experience is that all LLMs that I have tested so far did a very good job producing D code.
I actually think that the average D code produced has been superior to the code produced for the C++ problems I tested. This may be an outlier (the problems are quite different), but the quality issues I saw on the C++ side came partially from the ease with which the language enables incompatible use of different features to achieve similar goals (e.g. smart_ptr vs. new/delete).
The result is hardly a clean room implementation. It was rather a brute-force attempt to decompress fuzzily stored knowledge contained within the network, and it required close steering (using a big suite of tests) to get a reasonable approximation of the desired output. The compression and storage happened during the LLM training.
Nobody disputes that the LLM was drawing on knowledge in its training data. Obviously it was! But you'll need to be a bit more specific with your critique, because there is a whole spectrum of interpretations, from "it just decompressed fuzzily-stored code verbatim from the internet" (obviously wrong, since the Rust-based C compiler it wrote doesn't exist on the internet) all the way to "it used general knowledge from its training about compiler architecture and x86 and the C language."
Your post is phrased like it's a two-sentence slam-dunk refutation of Anthropic's claims. I don't think it is, and I'm not even clear on what you're claiming precisely, except that LLMs use knowledge acquired during training, which we all agree on here.
"clean room" usually means "without looking at the source code" of other similar projects. But presumably the AIs training data would have included GCC, Clang, and probably a dozen other C compilers.
Suppose you, the human, are working on a clean room implementation of a C compiler. How do you go about doing it? Will you need to know about a) the C language, and b) the inner workings of a compiler? How did you acquire that knowledge?
It doesn't matter how you gained general knowledge of compiler techniques, as long as you don't have specific knowledge of the implementation of the compiler you are reverse engineering.
If you have ever read the source code of the compiler you are reverse engineering, you are by definition not doing a clean room implementation.
Claude was not reverse engineering here. By your definition no one can do a clean room implementation if they've taken a recent compilers course at university.
Claude was reverse engineering gcc. It was using it as an oracle and attempting to exactly match its output. That is the definition of reverse engineering. Since Claude was trained on the gcc source code, that's not a clean room implementation.
> By your definition no one can do a clean room implementation if they've taken a recent compilers course at university.
Clean room implementation has a very specific definition. It’s not my definition. If your compiler course walked through the source code of a specific compiler then no you couldn’t build a clean room implementation of that specific compiler.
There is no specific definition of clean room implementation. Please provide a source for your claim otherwise.
There are many well known examples of clean room implementation. One example that survived lawsuits is Sony v. Connectix:
During production, Connectix unsuccessfully attempted a Chinese wall approach to reverse engineer the BIOS, so its engineers disassembled the object code directly. Connectix's successful appeal maintained that the direct disassembly and observation of proprietary code was necessary because there was no other way to determine its behavior - [0]
That practice is similar to GCC being used here to verify the output of the generated compiler, arguably even more intrusive.
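To make the "oracle" part concrete: differential testing against a reference compiler usually looks something like the sketch below. This is only an illustration; the candidate compiler name (mycc) and the tests/ corpus are hypothetical.

    import subprocess
    import sys
    from pathlib import Path

    ORACLE = "gcc"        # the known-good reference compiler
    CANDIDATE = "./mycc"  # hypothetical name for the compiler under test

    def compile_and_run(compiler: str, source: Path, exe: str) -> str:
        """Compile `source` with `compiler`, execute the binary, return stdout."""
        subprocess.run([compiler, str(source), "-o", exe], check=True)
        return subprocess.run([exe], capture_output=True, text=True).stdout

    for source in sorted(Path("tests").glob("*.c")):  # hypothetical test corpus
        expected = compile_and_run(ORACLE, source, "/tmp/oracle_bin")
        actual = compile_and_run(CANDIDATE, source, "/tmp/candidate_bin")
        if expected != actual:
            print(f"mismatch on {source}: {expected!r} vs {actual!r}")
            sys.exit(1)
    print("candidate agrees with the oracle on all tests")

Nothing in that loop requires (or reveals) anything about gcc's internals; it only compares observable behavior, which is why oracle-style testing by itself is compatible with a clean room process.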
“clean room implementation” is a term of art with a specific meaning. It has no statutory definition though so you’re technically right. But it is a defense against copyright infringement because you can’t infringe on copyright without knowledge of the material.
>During production, Connectix unsuccessfully attempted a Chinese wall approach to reverse engineer the BIOS, so its engineers disassembled the object code directly.
This doesn’t mean what you think it means. They unsuccessfully attempted a clean room implementation. What they did do was later ruled to be fair use, but it wasn’t a clean room implementation.
Using gcc as an oracle isn’t what makes it not a clean room implementation. Prior knowledge of the source code is what makes it not a clean room implementation. Using gcc as an oracle makes it an attempt to reverse engineer gcc, it says nothing about whether it is a clean room implementation or not.
There is no definition of “clean room implementation” that allows knowledge of source code. Otherwise it’s not a clean room implementation. It’s just reverse engineering/copying.
Again, reverse engineering is a valid use case of clean room implementation as I posted above, so you don't have a point there.
> “clean room implementation” is a term of art with a specific meaning.
What is the specific meaning you are talking about? If I set out to do a clean room implementation of some software, what do I need to do, specifically, so that I will prevail against any copyright infringement claims? The answer is that there is no such surefire guarantee.
Re: Sony v. Connectix, clean room is meant to protect against copyright infringement, and since Connectix was ruled not to be infringing on Sony's copyrights, their implementation is practically clean room under the law, despite all the pushback. Since Connectix prevailed, I'm sure the C compiler in question would prevail as well if it got sued.
Finally, take Phoenix vs. IBM, re: the former's implementation of the latter's PC BIOS:
Whenever Phoenix found parts of this new BIOS that didn't work like IBM's, the isolated programmer would be given written descriptions of the problems, but not any coded solutions that might have hinted at IBM's original version of the software - [0]
That very much sounds like using GCC as an online known-good compiler oracle to compare against in this case.
You’re getting confused because you are substituting the goal of a clean room implementation for its definition. And you are not understanding that “clean room implementation” is one specific type of reverse engineering.
The goal is to avoid copyright infringement claims. A specific clean room implementation may or may not be successful at that.
This does not mean that any reverse engineering attempt that successfully avoids copyright infringement was a clean room implementation.
A clean room implementation is a specific method of reverse engineering where one team writes a spec by reviewing the original software and the other team attempts to implement that spec. The entire point is so that the 2nd team has no knowledge of proprietary implementation details.
If the 2nd team has previously read the entire source code that defeats the entire purpose.
> That very much sounds like using GCC as an online known-good compiler oracle to compare against in this case.
Yes and that is absolutely fine to do in a clean room implementation. That’s not the part that makes this not a clean room implementation. That’s the part that makes it an attempt at reverse engineering.
> you are by definition not doing a clean room implementation.
This makes no sense. Reverse engineering IS an application of clean room implementation. Citing Wikipedia:
“Clean-room design (also known as the Chinese wall technique) is the method of copying a design by reverse engineering and then recreating it without infringing any of the copyrights associated with the original design”
The result is a fuzzy reproduction of the training input, specifically of the compilers contained within it. The reproduction in a different, yet still similar enough, programming language does not refute that. The implementation was strongly guided by a compiler and a suite of tests acting as an explicit filter on those outputs, limiting the acceptable solution space and excluding unwanted interpolations of the training set that also result from the lossy input compression.
The fact that the implementation language for the compiler is Rust doesn't factor into this. ML-based natural language translation has proven that model training produces an abstract space of concepts internally that maps from and to different languages on the input and output side. All this points to is that there are different implicitly formed decoders for the same compressed data embedded in the LLM, and the keyword "Rust" in the input activates one specific to that programming language.
Checking for similarity with compilers that consist of orders of magnitude more code probably doesn't reveal much. There are many smaller compilers for C-adjacent languages out there, plus code fragments from textbooks.
Thanks for elaborating. So what is the empirically-testable assertion behind this… that an LLM cannot create a (sufficiently complex) system without examples of the source code of similar systems in its training set? That seems empirically testable, although not for compilers without training a whole new model that excludes compiler source code from training. But what other kind of system would count for you?
I personally work on simulation software and create novel simulation methods as part of the job. I find that LLMs can only help if I reduce the task to a translation of detailed algorithm descriptions from English to code. And even then, the output is often riddled with errors.
If all it takes is "trained on the Internet" and "decompress stored knowledge", then surely GPT-3, 3.5, 4, 4.1, 4o, o1, o3, o4, 5, 5.1, 5.x should have been able to do it, right? Claude 2, 3, 4, 4.1, 4.5? Surely.
Well, "Reimplement the c4 compiler - C in four functions" is absolutely something older models can do. Because most are trained, on that quite small product - its 20kb.
But reimplementing that isn't impressive, because its not a clean room implementation if you trained on that data, to make the model that regurgitates the effort.
This comparison is only meaningful with comparable numbers of parameters and context window tokens. And then it would mainly test the efficiency and accuracy of the information encoding. I would argue that this is the main improvement over all model generations.
Perhaps 4.5 could also do it? We don't really know until we try. I don't trust the marketing material that much. The fact that previous (smaller) versions could or couldn't do it does not really prove or disprove that claim.
Even with 1 TB of weights (probable size of the largest state of the art models), the network is far too small to contain any significant part of the internet as compressed data, unless you really stretch the definition of data compression.
Take the C4 training dataset for example. The uncompressed, uncleaned, size of the dataset is ~6TB, and contains an exhaustive English language scrape of the public internet from 2019. The cleaned (still uncompressed) dataset is significantly less than 1TB.
I could go on, but I think it's already pretty obvious that 1TB is more than enough storage to represent a significant portion of the internet.
A lot of the internet is duplicate data, low quality content, SEO spam etc. I wouldn't be surprised if 1 TB is a significant portion of the high-quality, information-dense part of the internet.
I was curious about the scale of 1TiB of text. According to WolframAlpha, it's roughly 1.1 trillion characters, which breaks down to 180.2 billion words, 360.5 million pages, or 16.2 billion lines. In terms of professional typing speed, that's about 3800 years of continuous work.
So post-deduplication, I think it's a fair assessment that a significant portion of high-quality text could fit within 1TiB. Though 'high-quality' is a pretty squishy and subjective term.
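Out of curiosity, here is the rough arithmetic behind those figures; the characters-per-word/page/line constants are my own assumptions, but they reproduce the numbers above:

    # Rough reconstruction of the 1 TiB estimates; the chars-per-word/page/line
    # constants below are assumptions, chosen to match typical English text.
    chars = 2**40                 # 1 TiB at ~1 byte per character -> ~1.10e12
    words = chars / 6.1           # ~6.1 chars per word incl. space -> ~1.80e11
    pages = chars / 3050          # ~3050 chars per page            -> ~3.61e8
    lines = chars / 68            # ~68 chars per line              -> ~1.62e10
    wpm = 90                      # professional typist, words per minute
    years = words / wpm / 60 / 24 / 365.25   # ~3.8e3 years of nonstop typing
    print(f"{chars:.3g} chars, {words:.3g} words, {pages:.3g} pages, {lines:.3g} lines")
    print(f"about {years:,.0f} years of continuous typing at {wpm} wpm")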
This is obviously wrong. There is a bunch of knowledge embedded in those weights, and some of it can be recalled verbatim. So, by virtue of this recall alone, training is a form of lossy data compression.
I challenge anyone to try building a C compiler without a big suite of tests. Zig is the most recent attempt and they had an extensive test suite. I don't see how that is disqualifying.
If you're testing a model I think it's reasonable that "clean room" have an exception for the model itself. They kept it offline and gave it a sandbox to avoid letting it find the answers for itself.
Yes, the compression and storage happened during training. Before, it still didn't work; now it does much better.
The point is - for a NEW project, no one has an extensive test suite. And if an extensive test suite exists, it's probably because the product that uses it also exists, already.
What if it could translate the C++ standard INTO an extensive test suite that actually captures most corner cases and doesn't generate false positives - again, without internet access and without using gcc as an oracle, etc.?
How? The printer only ever retrieves G code for individual parts without any knowledge of what they are going to be assembled into. There is no viable way to solve this classification problem on this kind of incomplete data, is there?
That's broadly how it works today, yes: The printer itself has no concept of what it is printing. It's just running some heaters and spinning some motors in response to gcode.
Since such a printer is incapable of determining whether or not this gcode represents a legislatively-restricted item and then blocking its production, then that machine becomes illegal to sell in New York. Easy-peasy. It just takes a quick vote or two and the stroke of a pen, and it is done.
You're probably thinking something like "But that doesn't work at all," and I agree. But sometimes legislators just don't care that they've thrown out the baby along with the bathwater.
It depends on how you define the problem. Certainly a human can look at a part and say "that's a lower receiver", but you probably can make something that functions as a firearm exclusively from inconspicuous parts. For the more limited case, an AI can definitely be trained; the broader case is likely unsolvable.
It's not nearly that hard of a problem. There are n gun files on the internet, so validate the hash of those n files (G-code, whatever). These people aren't CADing their own designs.
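For what it's worth, the naive version of that check is just a digest blocklist. A minimal sketch (the digest below is a placeholder, just the SHA-256 of an empty file, not real data):

    import hashlib
    from pathlib import Path

    # Hypothetical blocklist: SHA-256 digests of the n known files.
    BLOCKED = {
        "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
    }

    def is_blocked(path: Path) -> bool:
        """Refuse to print any file whose digest is on the blocklist."""
        return hashlib.sha256(path.read_bytes()).hexdigest() in BLOCKED

The replies below explain why this falls apart in practice: the printer never sees the model file itself, and every re-slice produces different bytes.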
One big part of this is that gcode isn't really a 3D model; it's a set of instructions on how to move the printhead around.
You don't download the gcode directly, because that varies by printer. You download a model, and then a slicing program turns that into a set of printer-specific gcode. Any subtle settings changes would change the hash of this gcode.
And the printer doesn't really know what the model is. It would have to reverse the gcode instructions back into a model somehow. The printer isn't really the place to detect and prevent this sort of thing imo. Especially with how cheap some 3d printers are getting, they often don't really have much compute power in them. They just move things around as instructed by the g-code. If the g-code is malformed it can even break the printer in some instances, or at least really screw up your print.
There are even scripts that modify the gcode to do weird things the printer really isn't designed for, like print something and then have the printer move in such a way to crash into and push the printed object off the plate, and then start over and print another print. The printer will just follow these instructions blindly.
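To make that concrete, here is a toy sketch of what firmware actually does with gcode (heavily simplified assumptions: absolute positioning, only G0/G1 linear moves, no feed rates or heaters). The point is that it only ever sees bare axis targets, never anything model-shaped:

    # Toy interpreter for the motion subset of G-code; real firmware handles
    # feed rates, heaters, relative moves, and much more.
    def execute(gcode: str) -> None:
        position = {"X": 0.0, "Y": 0.0, "Z": 0.0, "E": 0.0}
        for line in gcode.splitlines():
            line = line.split(";")[0].strip()      # drop comments and whitespace
            if not line:
                continue
            words = line.split()
            if words[0] in ("G0", "G1"):           # linear move
                for word in words[1:]:
                    axis, value = word[0], float(word[1:])
                    if axis in position:
                        position[axis] = value
                print(f"move to {position}")       # real firmware drives steppers here

    execute("G1 X10 Y20 E0.4 ; no hint of what part this belongs to")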
Given that quite simple G-code, say a pair of nested circles with code for tool changes/accessory activation, can make two wildly different parts depending on which machine it is run on:
- a washer if run on a small machine in metric w/ flood coolant
- a lamp base if run on a larger router in Imperial w/ a tool changer
and that deriving what will be made by a given G-code file in 3D is a problem which the industry hasn't solved in decades, the solution of which would be worthy of a Turing Award _and_ a Fields Medal, I don't see this happening.
A further question, just attempting it will require collecting a set of 3D models for making firearms --- who will persuade every firearms manufacturer to submit said parts, where/how will they be stored, and how will they be secured so that they are not used/available as a resource for making firearms?
A more reasonable bit of legislation would be that persons legally barred from owning firearms are barred from owning 3D printers and CNC equipment unless there is a mechanism to submit parts to their parole officer for approval before manufacturing, since that's the only class of folks to which the 2nd Amendment doesn't apply, and a reasonable argument is:
1st Amendment + 2nd Amendment == The Right to 3D Print and Bear Arms
Guns can be made out of simple geometric shapes like tubes, blocks, and simple machines like levers and springs. There is mathematically no way to distinguish a gun part from a part used in home plumbing - in fact you can go to the plumbing section of your local hardware store and buy everything you need to build a fully functional shotgun.
In 3D modeling, there are parametric files where the end user is expected to modify the input parameters to fit their needs. So for example, if you have multiple parts that need to fit together, you may need to adjust the tolerances for that fit, because the physical shape will vary depending on your printer settings and material.
Making tiny modifications isn't just a method of circumvention; it's part of the main workflow of using a 3D model.
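As a toy illustration (the function and the clearance values are invented for this example):

    # Hypothetical parametric part: the user tunes `clearance_mm` to their
    # printer and material before slicing; the values here are made up.
    def press_fit(pin_diameter_mm: float, clearance_mm: float = 0.2):
        """Return (pin, hole) diameters for a press-fit joint."""
        return pin_diameter_mm, pin_diameter_mm + clearance_mm

    print(press_fit(5.0, clearance_mm=0.15))  # well-tuned printer, rigid PLA
    print(press_fit(5.0, clearance_mm=0.30))  # looser fit for flexible filament

Two people printing the "same" design end up with different geometry, different sliced gcode, and therefore different file hashes.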
This is an extremely bold claim and I think that it completely overlooks how Photoshop is used by professionals in practice. Professional users want extremely fine grained and precise control over their tools to achieve the specific results that they want. AI "image editing" is incapable of providing anything remotely similar.
There is one big argument against these "good enough" solutions: commercial business software providers need to put a lot of R&D into finding generalized workflows that apply to as many clients as possible. Effectively, they find and encode current standard practices into their products. This is valuable from a business operations perspective in two ways: it's a good bet that transitioning the customer's operations to match the software is cleaning up internal processes, and it makes onboarding new employees easier because the tools and workflows should be much more familiar right from the start.
I don't find this surprising. Code and data models encode the results of accumulated business decisions, but nothing about the decision-making process or rationale. Most of the time, this information is stored only in people's heads, so any automated tool is necessarily blind.
This succinctly captures one of the key issues with (current) AI actually solving real problems outside of small "sandboxes" where it has all the information.
When an AI can email/message all the key people who have the institutional knowledge, ask them the right discovery questions (probably over a few rounds, working out which bits are human "hallucinations" that don't make sense), collect that information, and use it to create a solution, then human jobs will be in real trouble.
Until then, the AI is just a productivity boost for us.
The AI will also have to be trained to be diplomatic and maybe even cunning, because, as I can personally attest, answering questions from an AI is an extremely grating and disillusioning experience.
There are plenty of workers who refuse to answer questions from a human until it's escalated far enough up the chain to affect their paycheck/reputation. I'm sure the fact that the intelligence is artificial will only multiply the disdain/noncompliance.
But then maybe there will be strategies for masking from where requests are coming, like a system that anonymizes all requests for information. Even so, I feel like there would still be a way that people would ping / walk up to their colleague in meatspace and say “hey that request came from me, thanks!”
I still expect this feature to roll out worldwide with some legalese fine print that the customer is responsible for configuring and operating the product "in accordance with local laws". I'd be really surprised if MS handles this differently.