Hacker News | mtlynch's comments

> 5 out of 1000+ reports to be valid is statistically worse than running a fuzzer on the codebase.

Carlini said "hundreds" of crashes, not 1000+.

It's not that only 5 were true positives and the rest were false positives. 5 were true positives and Carlini doesn't have bandwidth to review the rest. Presumably he's reviewed more than 5 and some were not worth reporting, but we don't know what that number is. It's almost certainly not hundreds.

Keep in mind that Carlini's not a dedicated security engineer for Linux. He's seeing what's possible with LLMs and his team is simultaneously exploring the Linux kernel, Firefox,[0] GhostScript, OpenSC,[1] and probably lots of others that they can't disclose because they're not yet fixed.

[0] https://www.anthropic.com/news/mozilla-firefox-security

[1] https://red.anthropic.com/2026/zero-days/


OP here.

I don't understand this critique. Carlini did use Claude Code directly. Claude Code used the Claude Opus 4.6 model, but I don't know why you'd consider it inaccurate to say Claude Code found it.

GPT 5.4 might be capable of finding it as well, but the article never made any claims about whether non-Anthropic models could find it.

If I wrote about achieving 10k QPS with a Go server, is the article misleading unless I enumerate every other technology that could have achieved the same thing?


Also, he did compare with earlier versions that, before 4.5, were dramatically worse at finding the same problems. There's even a graph. That seems to pretty solidly support the idea that this is "gain of function" as it were...

> What is not mentioned is that Claude Code also found one thousand false positive bugs, which developers spent three months to rule out.

Source? I haven't seen this anywhere.

In my experience, false positive rate on vulnerabilities with Claude Opus 4.6 is well below 20%.


To the issue of AI submitted patches being more of a burden than a boon, many projects have decided to stop accepting AI-generated solutioning:

https://blog.devgenius.io/open-source-projects-are-now-banni...

These are just a few examples. There are more that google can supply.


According to Willy Tarreau[0] and Greg Kroah-Hartman[1], this trend has recently significantly reversed, at least from the reports they've been seeing on the Linux kernel. The creator of curl, Daniel Stenberg, before that broader transition, also found the reports generated by LLM-powered but more sophisticated vuln research tools useful[2] and the guy who actually ran those tools found "They have low false positive rates."[3]

Additionally, there was no mention in the talk by the guy who found the vuln discussed in the TFA of what the false positive rate was, or that he had to sift through the reports because it was mostly slop — or whether he was doing it out of courtesy. Additionally, he said he found only several hundred, iirc, not "thousands." All he said was:

"I have so many bugs in the Linux kernel that I can’t report because I haven’t validated them yet… I’m not going to send [the Linux kernel maintainers] potential slop, but this means I now have several hundred crashes that they haven’t seen because I haven’t had time to check them." (TFA)

He quite evidently didn't have to sift through thousands, or spend months, to find this one, either.

[0]: https://lwn.net/Articles/1065620/

[1]: https://www.theregister.com/2026/03/26/greg_kroahhartman_ai_...

[2]: https://simonwillison.net/2025/Oct/2/curl/p

[3]: https://joshua.hu/llm-engineer-review-sast-security-ai-tools...


No, they haven't. Read the ai slop you posted carefully.

It's a policy update that enables maintainers to ignore low effort "contributions" that come from untrusted people in order to reduce reviewing workload.

An Eternal September problem, kind of.


Didn't you just restate what the parent claimed?

No, that's not at all the same thing: ai-generated contributions from people with a track record for useful contributions are still accepted.

Right. AI submissions are so burdensome that they have had to refuse them from all except a small set of known contributors.

The fact that there’s a small carve out for a specific set of contributors in no way disputes what Supermancho claimed.


A powertool that needs discretion and good judgement to be used well is being restricted to people with a track record of displaying good judgement. I see nothing wrong here.

AI enables volume, which is a problem. But it is also a useful tool. Does it increase review burden? Yes. Is it excessively wasteful energy wise? Yes. Should we avoid it? Probably no. We have to be pragmatic, and learn to use the tools responsibly.


I never said anything is wrong with the policy. Or with the tool use for that matter.

This whole chain was one person saying “AI is creating such a burden that projects are having to ban it”, someone else being willfully obtuse and saying “nuh uh, they’re actually still letting a very restricted set of people use it”, and now an increasingly tangential series of comments.


I feel like you're still failing to grasp the point.

The only difference is that before AI the number of low effort PRs was limited by the number of people who are both lazy and know enough programming, which is a small set because a person is very unlikely to be both.

Now it's limited to people who are lazy and can run ollama with a 5M model, which is a much larger set.

It's not an AI code problem by itself. AI can make good enough code.

It's a denial of service by the lazy against the reviewers, which is a very very different problem.


No one is missing your point. The issue is that you are responding to a point no one made.

The grounding premise of this comment chain was “AI submitted patches being more of a burden than a boon”. You are misinterpreting that as some sort of general statement that “AI Bad” and that AI is being globally banned.

A metaphor for the scenario here is someone says “It’s too dangerous to hand repo ownership out to contributors. Projects aren’t doing that anymore.” And someone else comes in to say “That’s not true! There are still repo owners. They are just limiting it to a select group now!” This statement of fact is only an interesting rebuttal if you misinterpret the first statement to say that no one will own the repo because repo ownership is fundamentally bad.

> It's a denial of service by the lazy against the reviewers, which is a very very different problem.

And it is AI enabling this behavior. Which was the premise above.


Yes, but technically no different than "good contributions from humans are still accepted, AI slop can fuck off".

Since the onus falls on those "people with a track record for useful contributions" to verify, design tastefully, test and ensure those contributions are good enough to submit - not on the AI they happen to be using.

If it fell on the AI they're using, then any random guy using the same AI would be accepted.


Same. Codex and Claude Code on the latest models are really good at finding bugs, and really good at fixing them in my experience. Much better than 50% in the latter case and much faster than I am.

Source: """AI is bad"""

In my experience, the issue has been likelihood of exploitation or issue severity. Claude gets it wrong almost all the time.

A threat model matters and some risks are accepted. Good luck convincing an LLM of that fact


In TFA:

   I have so many bugs in the Linux kernel that I can’t 
   report because I haven’t validated them yet… I’m not going 
   to send [the Linux kernel maintainers] potential slop, 
   but this means I now have several hundred crashes that they
   haven’t seen because I haven’t had time to check them.
    
    —Nicholas Carlini, speaking at [un]prompted 2026

Those aren't false positives; they're results he hasn't yet inspected.

I wrote a longer reply here: https://news.ycombinator.com/item?id=47638062


>Those aren't false positives; they're results he hasn't yet inspected.

It's not a XOR


The article quote was being given as the supposed source for "Claude Code also found one thousand false positive bugs, which developers spent three months to rule out", so should substantiate that claim - which it doesn't.

If the claim was instead just "a good portion of the hundreds more potential bugs it found might be false positives", then sure.


Yes it is. They're not false positives until they're reported and consume maintainer time.

False positives can be eliminated mechanistically by testing if they actually work, in a sufficiently isolated automated test apparatus.

The hard thing is reducing detected crashes to well-formulated test cases that help rather than hinder maintainers.


some of them certainly are…

The comment said "Claude Code also found one thousand false positive bugs, which developers spent three months to rule out."

Please explain how a bug can both be unvalidated, and also have undergone a three month process to determine it is a false positive?


That's missing the point.

Ultimately, companies who use H1B visas will outcompete companies who don't because the H1B system gives them cheaper labor costs. The solution has to come at the regulatory level.

150 years ago, if you told someone "oh if you want safer factories just build one yourself," that business would never survive because they'd get outcompeted by the less scrupulous factory owners who were happy to mangle their employees and just replace them with more desperate workers.


Your factory analogy is great because that's exactly how it happened and we outsourced everything to China.

It's significantly easier to outsource white collar work.

And you can keep playing the regulation game until there are no companies left.


Right but you act like nothing can be done about that. If you outsource enough then don't count on the US government to protect you overseas. Go ahead and risk nationalization. The very fact that we are having this discussion shows how bad the situation is: companies can effectively threaten to move all their jobs overseas, thereby threatening US workers with economic ruin. That's not okay.

The government grants broad liability shields to owners of companies because there is a vested state interest in facilitating commerce/economic growth. I guess if companies are just going to move overseas then maybe those liability shields could just be vacated. They don't deserve to have their cake and eat it too. It's easy: want the liability shield? Stop being fucking greedy and be a good corporate citizen. Otherwise, no cake for you.


That seems like a weird tangent.

If you give companies the ability to choose US government protection against their overseas operations being nationalized versus the ability to hire foreigners on work visas, the overwhelming majority of companies will choose the latter.


I don't understand what you're advocating.

Do you think it's a bad thing that the US implemented occupational safety laws?

I agree that it's not great that many of those risks just shifted overseas, but it's certainly a net positive that American employers can no longer let workers die or get permanently injured and just let the workers absorb those costs.

The H1B visa system isn't just a natural part of capitalism that I want the government to regulate. It's an artificial condition created by bad regulations. You can argue that we shouldn't have immigration restrictions at all since they're an artificial economic constraint, but that's a whole other argument.


I don't think it's specifically bad to have occupational safety laws, but overregulation in general has a choking effect.

In the time it takes you to navigate the system to build anything physical in the US, you can have two iterations of the product in China.

The US way of handling this is to go per incident and make one more rule, no matter how improbable that situation is. Eventually you end up with a system that needs a team of lawyers.


You can apply the same argument for illegal immigration and import from countries like China which flout labor and environmental standards.

But people who oppose H1B don’t seem opposed to that.


> Ultimately, companies who use H1B visas will outcompete companies who don't because the H1B system gives them cheaper labor costs. The solution has to come at the regulatory level.

I have the utmost respect for your work, am a customer of your fantastic product, and have been meaning to reach out to you for a while (in fact, I learned about Cory from your blog in 2021), but I had to push back hard on this.

TinyPilot didn't happen in India nor China. I can argue it would have been cheaper to build it at any one of those countries but you know much better about it than I.

Labor costs only matter when you're selling an absolute commodity that has no edge other than price.

Of all the people I would have expected to say that the solution has to come at the regulatory level given the experience, success you've had, with your transparency in how your company was doing, I am utterly surprised it was you.

I am more than happy to continue, and I've reached out - I just wish our initial email could have been way more pleasant!


Thanks for the kind words!

To clarify, I certainly agree it's possible for a business to succeed without using H1B visas, especially for something small at the scale of TinyPilot (7 people when I sold).

I just mean on a large scale, the companies that use H1B visas will generally outcompete the ones that don't.

What's the cost difference between a US citizen and an H1B? I'd guess it's something around 20% less expensive to hire an H1B visa holder. In an industry like software where the dominant cost is labor, then H1B companies have a 20% advantage over non-H1B companies. Non-H1B companies can outcompete them by being 20% better, but that's a big disadvantage to overcome.

Running my business actually made me oppose H1B visas more. The H1B visa system gives big businesses a massive advantage over small businesses. There's so much frictional cost to hiring someone on an H1B visa (legal fees, admin overhead) that it's not practical if you're only hiring 1-2 employees, but you'll get ROI if you're hiring 10-20. But it just gives an advantage to bigger business, and the advantage wouldn't exist if the H1B system didn't exist or if the government designed it to be employer-agnostic.


> Thanks for the kind words!

You deserve and earned them!

> There's so much frictional cost to hiring someone on an H1B visa (legal fees, admin overhead) that it's not practical if you're only hiring 1-2 employees

Very true but as you saw in my email, I have extremely experienced friends back in India who I have been able to hire as contractors without issue. No H1B - just plain old Slack, email and Forgejo. The playbook for asynchronous work is well tested and debugged by now. 2019 was a blessing.

I will concede, this doesn't work for every company - a hardware or biotech company definitely would appreciate people all being together in the same physical lab, in which case I hear you!

> the government designed it to be employer-agnostic

... but the government cannot be employer-agnostic, Michael.

The government is not an impartial, unbiased mainframe running in a DC somewhere. It's a group of people accepting and pushing policy who can be influenced, just like I am influencing you today, and you, me.

As an SMB and bootstrapped founder, you then have to choose between spending your time and efforts on being at the influencing table vs making actual design and business decisions at your startup the moment you yield influence to this group.

The bigger business simply doesn't have to make that decision. So don't help tip the scales further against yourself and SMBs like you.

That's one of my points that I was hoping to discuss in our email - that involving the government adds further overhead, resistance and expense into the system, so we should exhaust all other options before we even consider it. I personally have never seen an option that needed government intervention that couldn't be solved by the free market. I don't work in healthcare or education or finance - maybe those do require government intervention - I am entirely unfamiliar with those domains and not talking about them.

The other interpretation of being employer-agnostic is that the H1B isn't tied to a "sponsoring company" and doesn't require any of the transfer shenanigans. Sure, but the reason it isn't that way is that it's a rare "dual intention" visa: you are a non-immigrant who can become a citizen through the H1B. This was a feature added to the H1B to entice top quality talent. The problem with making the H1B employer-agnostic is that now you and I can start a perfectly legal, fantastic lifestyle business hiring H1Bs, petitioning for their green cards and immediately letting them go. As long as they can figure out a way to eat and sleep, they can now become citizens. So for it to be employer-agnostic, we need to remove the "dual intention" - the very carrot employers use (if you tough it out through all those JIRA tickets, you'll get to be a citizen!)

> I just mean on a large scale, the companies that use H1B visas will generally outcompete the ones that don't.

This is where I continue to push back. I was hoping to discuss over email but do you feel you could have built TinyPilot at either MS or Google, not as a side project but as an official product offering? I don't want to get too tied up into the specific features that TinyPilot offers - I'm using it as a proxy for a very useful, innovative product that provably solves real customer problems.

At the large scales where H1B makes sense, you as a major decision maker at the company wouldn't allow a worker with a risky status like the H1B to be responsible for high impact, meaningful pieces of work. Actually, forget H1Bs - at the large scales where H1B makes sense, you would simply not entrust a single individual, H1B or not, with high impact, meaningful pieces of work.

If we disagree on this take, please say so - I am here to learn and listen.

The original intention of the H1B was to handle temporary supply shocks in knowledge work while the U.S. slowly fixed those supply constraints on its own.

If avocado toasts became an overnight sensation, the H1B was a way to provide breathing room to local avocado farms so the demand could sustain or grow (and not collapse) while they came up to speed to meet that sudden demand.

The H1B wasn't designed to be a way to absolutely wipe out local avocado farms because it's cheaper to just import avocados from Mexico.

The H1B has completely diverged from that and is going in the opposite direction, where it's actively and negatively impacting domestic markets. Massive corporations in Asia have grown whose sole business model is exploiting this geographical arbitrage and nothing else.

What piece of critical, useful software that has had a mention on HN can you name or recall that has come out of these many multibillion dollar outsourcing giants?

A $60k/yr salary as a resident doctor is fantastic if you did most of your education in Asia but if you attended medical school anywhere in the U.S. and didn't have a 100% scholarship, you're starting your life off in crippling debt.

During COVID, there was an explosion of domestic coding bootcamps to address the supply constraints - this is precisely the kind of domestic correction that we, as the U.S., need to encourage in order to develop local talent and get them educated and motivated about tech. But these bootcamps require an investment that in Asia covers education, boarding and lodging without any scholarship. There's just no competition when it comes to cost. We in the U.S. have an extremely high quality of living and our CoL reflects that. As I wrote in my email, things we take for granted here - running water (not potable, just water that you could water your plants with), 24x7 electricity and internet - these are still unavailable where I was born, so of course, the CoL is cheaper. Way cheaper.

One might say, "OK then, free markets for the win" - that itself is a separate debate on its own.


>> the government designed it to be employer-agnostic

> The other interpretation of being employer-agnostic is that the H1B isn't tied to a "sponsoring company" and doesn't require any of the transfer shenanigans.

Right, this is what I was talking about.

I think the current system gives H1B employers way too much leverage over H1B employees and degrades the job market for everyone. Employers can tell H1Bs that they have to work evenings and weekends or be fired and leave the country. And then the same employer can turn around and tell US citizen employees that they also have to work evenings and weekends because the H1B employees are doing it. They have less leverage over the citizens because the citizens can get another job more easily, but forcing H1Bs to establish precedent definitely does pressure other employees, and I've seen this happen directly.

> So for it to be employer-agnostic, we need to remove the "dual intention" - the very carrot employers use (if you tough it out through all those JIRA tickets, you'll get to be a citizen!)

I think you could design it without such an obvious loophole, but I agree that there are probably loopholes no matter how you design it.

That said, I'm a bit confused about our disagreement at this point.

I think the H1B system is a net negative for the US economy, and it disproportionately hurts small businesses. I'd be in favor of a revised H1B system that allows companies to fill short-term labor shortages with foreign workers but with limits that prevent companies from abusing the system to depress wages and conditions for US workers, as they currently do with H1B today.

It sounds like your argument is that H1B doesn't matter because the companies using it aren't really innovating and so they'll naturally be outcompeted by smaller businesses who are too small to take advantage of the H1B system. Is that correct?

Also, I'm confused because you're saying you advocate free market solutions and that's why we shouldn't mess with the H1B system. The H1B system is the opposite of a free market solution. It's extra regulation that we'd be better off without.


Michael, Thank You for reading and responding back.

We disagree on the diagnosis and path forward, not the symptoms. Let me explain:

> I think the current system gives H1B employers way too much leverage over H1B employees and degrades the job market for everyone

Correct and it's by design. The overhead of H1B needs to be in the black when it hits the bottomline - the H1B isn't a charity auction to take brilliant engineers from developing countries and move them into the U.S. - it's to ensure the companies turn a profit on it.

> I think the H1B system is a net negative for the US economy, and it disproportionately hurts small businesses

Correct again on both counts. While it is a brilliant solution 1% of the time, it's misused 99% of the time.

> It sounds like your argument is that H1B doesn't matter because the companies using it aren't really innovating and so they'll naturally be outcompeted by smaller businesses who are too small to take advantage of the H1B system. Is that correct?

Correct again

> you're saying you advocate free market solutions and that's why we shouldn't mess with the H1B system. The H1B system is the opposite of a free market solution. It's extra regulation that we'd be better off without

Correct again

Where I disagree with you is when you said we need to add more regulation to the existing H1B system. To me, and this is not something I'm hearing for the first time, it sounds like a band aid on top of a bunch of band aids 12 feet deep.

The H1B system was wonderful when it was initially implemented. The U.S. was undergoing massive technological shifts leading to tremendous supply shocks - just like how we are struggling to purchase GPUs, RAM and SSD today. As much as we disagree, history has proven repeatedly this shock will pass (unless the market is distorted by new regulation).

However the H1B has long since distorted into a geo-arbitrage, QoL and CoL hack. Even the current administration's $100k fee is a bandaid. Just the fully loaded cost of a 4-6 year domestic education is more than that, so the domestic supply is being destroyed.

Ideally what should happen is we should decommission the H1B completely, no ifs or buts, and have a "cool down" period where we notice what impact it actually has on the domestic demand. I do appreciate that this isn't accounting for the case where there are only 1000 people worldwide who know how to train an LLM from scratch, or 1000 top-tier cardiologists worldwide that we would like to attract - I'm very sure we will figure something out for them, but I argue we need to discover why our domestic supply is lacking in the first place instead of continuing to rely on band aids.

The U.S. today seems to rely on stents to save it from heart attacks. We should probably take a strong, hard look at the diet and lifestyle choices instead of continuing to rely on stents as a savior. It was necessary when we had an emergency, but a sustained reliance on emergency intervention points to underlying structural issues.


> If a company has the capability to hire literal people to waste your time, they can deploy more AI than you to waste the time of your AI.

I don't totally buy OP's argument, but I think you're dismissing it unfairly.

His point is that in the pre-LLM world, if a company wants to waste your time, they can hire a call center employee overseas for US$4/hr and make you wait for an hour to talk to them for 30 minutes. If you value your time at $80/hr, then the 90-minute call cost you $120, but it only cost the company about $2, thus the asymmetry.

OP's claim is that now, the asymmetry is gone. If both you and the company try to use AI, the company has less leverage to impose costs on you. They can deploy more AI to waste more of your time, but that means the asymmetry now is in the customer's favor because it costs the company more than the customer to get support.
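To make the pre-LLM arithmetic above concrete, here's a minimal sketch. The dollar figures and durations are the ones from the example; the function and variable names are just illustrative, and it assumes the company only pays the agent for the 30 minutes of talk time.

```python
# Figures from the example above; all names here are illustrative.
CUSTOMER_RATE = 80.0  # USD/hour the customer values their time at
AGENT_RATE = 4.0      # USD/hour for the overseas call center employee
WAIT_HOURS = 1.0      # time the customer spends on hold
TALK_HOURS = 0.5      # time the customer and agent spend talking


def call_costs(customer_rate, agent_rate, wait_hours, talk_hours):
    """Return (customer_cost, company_cost) in USD for one support call.

    Assumption: the company pays the agent only for talk time, while the
    customer pays (in time) for both the wait and the conversation.
    """
    customer_cost = customer_rate * (wait_hours + talk_hours)
    company_cost = agent_rate * talk_hours
    return customer_cost, company_cost


customer_cost, company_cost = call_costs(
    CUSTOMER_RATE, AGENT_RATE, WAIT_HOURS, TALK_HOURS
)
asymmetry = customer_cost / company_cost

print(f"customer: ${customer_cost:.2f}, company: ${company_cost:.2f}")
print(f"the call costs the customer {asymmetry:.0f}x what it costs the company")
```

With those inputs, the customer's cost works out to $120 and the company's to $2, so the call costs the customer 60x what it costs the company.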


But the company can deploy economies of scale to make their AI chatbots defeat yours? Company A can still hire someone in India to continually change their chat bot protocol or whatever to make it difficult for your chat bot to succeed in getting support.

How is the company achieving economies of scale with AI chatbots? The dominant cost is token cost, and even if they're a $1B company, they're not getting that much better of a deal on tokens or GPUs than a regular consumer.

Also, remember that the company is still constrained by having to support real humans who don't have their own AI. They can't just decide not to offer any support at all or force infinite wait times, or they'd be doing that today.


>The by far nastiest part is CI. GitHub has done an excellent job luring people in with free macOS runners and infinite capacity for public repos.

This was my biggest blocker as well, as there weren't any managed CIs that supported Codeberg until recently.

NixCI[0] recently added support for Codeberg, and I've had a great experience with it. The catch is that you have to write your CI in Nix, though with LLMs, this is actually pretty easy. Most of my CI jobs are just bash scripts with some Nix wiring on top.[1] It also means you can reproduce all your CI jobs locally without changing any code.

[0] https://nix-ci.com

[1] https://codeberg.org/mtlynch/little-moments/src/commit/d9856... - for example


It's a bit weird to see someone doing free work for the community and then ask them to do even more free work.


Cool! This is something I've been thinking about and wanted to build too.

I don't want it to be like I'm a valuable person and lower people pay for the privilege of my attention, but I do like the idea of making it so that senders have skin in the game and can't just infinitely generate emails that waste other people's time.

What I'd like to see is different costs based on how I classify the email.

So, everyone except trusted contacts pays $5 per email to me. If I think your email was pure spam, I keep the $5. If I reply, you get your money back. If I do nothing and never classify the email, you get back $4 after 30 days. And I can manually override like reply and keep the money, but those are the defaults.
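The default rules above could be sketched roughly like this. It's only a sketch of the idea, not any real service's API: the names are hypothetical, amounts are in cents to avoid float rounding, and I'm assuming "after 30 days" means day 30 or later.

```python
from enum import Enum

FEE_CENTS = 500                  # $5 per email from non-trusted senders
UNCLASSIFIED_REFUND_CENTS = 400  # $4 back if the email is never classified
TIMEOUT_DAYS = 30                # assumption: refund triggers at day 30 or later


class Outcome(Enum):
    SPAM = "spam"                  # recipient marked the email as pure spam
    REPLIED = "replied"            # recipient replied
    UNCLASSIFIED = "unclassified"  # recipient did nothing


def sender_refund_cents(outcome, days_elapsed=0, trusted=False, keep_override=False):
    """Return how many cents go back to the sender under the default rules."""
    if trusted:
        return 0  # trusted contacts never paid, so there is nothing to refund
    if keep_override:
        return 0  # manual override: recipient keeps the money, e.g. even on reply
    if outcome is Outcome.REPLIED:
        return FEE_CENTS  # a reply refunds the full fee
    if outcome is Outcome.SPAM:
        return 0  # recipient keeps the full fee
    # Unclassified: partial refund only once the timeout has passed.
    if days_elapsed >= TIMEOUT_DAYS:
        return UNCLASSIFIED_REFUND_CENTS
    return 0  # still pending, nothing refunded yet


print(sender_refund_cents(Outcome.REPLIED))
print(sender_refund_cents(Outcome.UNCLASSIFIED, days_elapsed=30))
```

Keeping the override as a separate flag means the defaults stay simple, and the "reply and keep the money" case doesn't need its own outcome.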


$5 is probably too much, tho. I'd be looking more at the $.2 to $1 range.

Maybe a 3 to 4 tier inbox: known and trusted users can contact you without paying; a high-value inbox for the $1+ range; a low-value inbox for the $.2 range, where emails won't be auto-deleted; and a very-low-value inbox, where emails are deleted after a period that depends on the amount paid, with free mails being gone within e.g. an hour, all the way up to e.g. a month for $.19 mails.

Then unify those inboxes and set up notifications to the users' likings.

Also, I'd normalize e.g. 10% going to the e-mail service providers and enshrine that amount into the protocol right away. Otherwise the protocol won't get a lot of attention from the major providers, and if it does, the provider taking his share is going to become normalized anyway. But then the split isn't going to be in favor of the users. Which isn't negative per se, but it'd be nice to have at least one type of service where this split is reversed. And it is fair to assume whoever takes the larger split has more influence on the prices, potentially either making this feature useless or pricing very casual users out of the service.


Hey Flavius! Thanks for reading and the kind words!


I'm obviously biased as a small business owner, but I think that logic assumes that the market is perfectly efficient, when it obviously isn't. Large companies have massive advantages in so many dimensions.

As a simple example, imagine that I built a site for buying ebooks that's better in every way than Amazon. I pay authors more, readers pay less, the ebooks are compatible with every device, and it's easier for both authors and readers to use my site than Amazon. I still probably couldn't survive against Amazon because they'd tell their authors that if they sell with me, they can't sell on Amazon.[0] They have such a market dominance that authors would lose money by using my platform, even if it's a demonstrably better product in every way with better pricing.

But it goes beyond that. Big businesses have all these other huge advantages so that they can succeed not because they're offering the most value but because they have a pre-existing advantageous market position:

- It's a small percentage of their costs to hire attorneys to look for tax loopholes

- They can manage the overhead of abusing the H1B visa system to hire workers at below-market rates

- They can sue people and get sued and still have 98% of their employees not paying attention to any lawsuits

- They can afford to sell things at a loss just to choke out smaller competitors

Look at trillion dollar industries where 95% of money goes to just 2-3 companies. The iOS/Android duopoly, the Visa/Mastercard duopoly. Do they control the market because they're just so great at offering value? Or does their market position and terrible government policy prevent anyone from competing with them effectively and offering consumers better choices?

[0] https://www.reuters.com/legal/transactional/amazon-must-face...

