Hacker News | 6thbit's comments

Quotes should be around “journalism”.

Let’s recognize those bait posts for what they are, which is certainly not journalism.


Yeah, having this opens up the LLM-assisted path to building shortcuts. Which is great! Maintaining them by hand is not.

Writing down specs for technical projects is a transformational skill.

I've had projects that seemed tedious or obvious in my head, only to discover hidden complexity when trying to put their triviality into written words. It really is a sort of meditation on the problem.

In the most important AI-assisted project I've shipped so far, I wrote the spec entirely myself first. But feeding it through an LLM feedback loop felt just as transformational: it not only gave me an easier-to-parse document, it also helped me understand both the problem and my own solution from multiple angles and allowed me to address gaps early on.

So I'll say: Do your own writing, first.


So both this and litellm went straight to PyPI without going to GitHub first.

Is there any way to set up PyPI to only publish packages that come from a tag matching a certain pattern on GitHub? Would such a measure help at all here?


Yes: if you use a Trusted Publisher with PyPI, you can constrain it to an environment. Then, on GitHub, you can configure that environment with a tag or branch protection rule that only allows the environment to be activated if the ref matches. You can also configure required approvers on the environment, to prevent anyone except your account (and potentially other maintainers you’d like) from activating the environment.

If they have compromised the token, wouldn't that mean the developer is compromised, and such access could be used to just put "curl whatever" into the build and publish that payload on PyPI?

I don’t understand the question, sorry.

I'll try to reformulate in a simpler way.

On Debian, all builds happen without internet access. So whatever ends up in the .deb file comes either from the dependencies or from the orig tarball.

Is anything similar done for builds that create artifacts for PyPI, so that a correspondence between the binary file and the sources exists? Or is there unrestricted internet access, so that what actually ends up on PyPI can come from anywhere and vetting the sources is of little help?


That’s a nice property of centralized package management systems; I don’t think anything exactly like that exists for PyPI. The closest thing would be a cryptographic attestation.

(If I wanted to taxonomize these things, I'd say that the Debian model is effectively a pinky promise that the source artifacts correspond to the built product, except that it's a better pinky promise because it's one-to-many instead of many-to-many like language package managers generally are. You can then formalize that pinky promise with keys and signatures, but at the end of the day you're still essentially binding a promise.)


Wasn't PEP 740 an attempt to solve this?

Depends on what you mean by “this.” If you mean build provenance, yes, if you mean transmuting PyPI into the kind of trust topology that Debian (for example) has, no.

(I think PEP 740 largely succeeds at providing build provenance; having downstream tooling actually do useful things with that provenance is harder for mostly engineering coordination reasons.)


Don't handle the token yourself. Use OICD ideally, or make sure to set it up carefully as a repository secret. Ensure the workflow runs in a read-only, minimal-permission, minimal-dependency environment. The issue with OICD is that it does not work with nested workflows, because GitHub does not propagate the claims.

*OIDC

It's not clear to me, what's the diff with v2?

They stacked the deck. If v2 was still rule inference + spatial reasoning, a bit like juiced-up Raven's Progressive Matrices, then v3 adds a whole new multi-turn explore/exploit agentic dimension to it.

Given how hard even pure v2 was for modern LLMs, I'm not surprised to see v3 crush them. But that won't last.


v2 was a static fill-in-the-blank task, whereas v3 is interactive.

There's world state that you can change, not just pixels to place.

Here's v2:

https://arcprize.org/tasks/ce602527


A safeguard worth exploring for some: the automatic import can be suppressed using the Python interpreter's -S option.

This would also disable the site import, so it's not generically viable for everyone without testing.
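The effect of -S is easy to check directly. Here's a quick sketch (assuming a stock CPython install) comparing startup with and without the flag:

```python
# Sketch: check whether the site module (which processes .pth files)
# was imported at startup, with and without the -S flag.
import subprocess
import sys

probe = "import sys; print('site' in sys.modules)"

normal = subprocess.run([sys.executable, "-c", probe],
                        capture_output=True, text=True).stdout.strip()
no_site = subprocess.run([sys.executable, "-S", "-c", probe],
                         capture_output=True, text=True).stdout.strip()

print(normal, no_site)
```

On a stock CPython this prints `True False`: with -S the site module is never imported, so .pth files are never processed, but anything that relies on site-packages being on sys.path may break, hence the "test first" caveat.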


It's not really "automatic import", as described. The exploit is directly contained in the .pth file; Python allows arbitrary code to run from there, with some restrictions that are meant to enforce a bit of sanity for well-meaning users and which don't meaningfully mitigate the security risk.

As described in https://docs.python.org/3/library/site.html :

> Lines starting with import (followed by space or tab) are executed.... The primary intended purpose of executable lines is to make the corresponding module(s) importable (load 3rd-party import hooks, adjust PATH etc).

So what malware can do is put something in a .pth file like

  import sys;exec("evil stringified payload")
and all restrictions are trivially bypassed. It used to not even require whitespace after `import`, so you could instead do something like

  import_=exec("evil stringified payload")
In the described attack, the imports are actually used; the standard library `subprocess` is leveraged to exec the payload in a separate Python process. Which, since it uses the same Python environment, is also a fork bomb (well, not in the traditional sense; it doesn't grow exponentially, but will still cause a problem).
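To make the mechanism concrete, here's a simplified sketch of how site-style .pth processing behaves. This is modeled loosely on what CPython's site module does; the function name and details are illustrative, not the real API:

```python
import os
import sys

def process_pth_line(line, sitedir):
    """Mimic .pth handling: 'import ...' lines are exec'd, others are paths."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return
    if line.startswith(("import ", "import\t")):
        exec(line)  # arbitrary code runs here, at interpreter startup
    else:
        # ordinary lines are appended to sys.path if they exist on disk
        path = os.path.join(sitedir, line)
        if os.path.exists(path) and path not in sys.path:
            sys.path.append(path)

# A single innocuous-looking line is enough to run arbitrary code:
process_pth_line('import sys; sys.demo_flag = "ran at startup"', "/tmp")
print(getattr(sys, "demo_flag", None))
```

The "restriction" that lines must start with `import` restricts nothing in practice, since a semicolon (or an exec of a stringified payload, as above) gets you the rest of the language.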

.pth files have worked this way since 2.1 (comparing https://docs.python.org/2.1/lib/module-site.html to https://docs.python.org/2.0/lib/module-site.html). As far as I can tell there was no PEP for that change.


The 1.82.7 exploit was executed on import. The 1.82.8 exploit used a .pth file, which is run at startup (module discovery, basically).

The title is a bit misleading.

The package was directly compromised, not “by supply chain attack”.

If you use the compromised package, your supply chain is compromised.


It's both. They got compromised by another supply chain attack on Trivy initially.

> Subsequent to this solve, we finished developing our general scaffold for testing models on FrontierMath: Open Problems. In this scaffold, several other models were able to solve the problem as well: Opus 4.6 (max), Gemini 3.1 Pro, and GPT-5.4 (xhigh).

Interesting. What's that “scaffold”? A sort of unit-test framework for proofs?


I think in this context, scaffolds are generally the harness that surrounds the actual model. For example, any tools, ways to lay out tasks, or auto-critiquing methods.

I think there's quite a bit of variance in model performance depending on the scaffold so comparisons are always a bit murky.
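As a purely hypothetical illustration of what such a scaffold can look like, assuming nothing about any real harness (the model call below is a canned stub so the loop is runnable):

```python
def call_model(prompt: str) -> str:
    # Stub standing in for a real LLM API call.
    return "place 1 1"

def run_task(task, max_turns=3):
    """Multi-turn scaffold loop: lay out state, get an action, apply it."""
    transcript = []
    for _ in range(max_turns):
        prompt = f"State: {task['state']}\nPropose one action."
        action = call_model(prompt)      # the model sits inside the harness
        task["state"].append(action)     # apply the action to world state
        transcript.append(action)
        if action == task.get("goal"):   # a crude success check / critic
            break
    return transcript

print(run_task({"state": [], "goal": "place 1 1"}))
```

The point is that tool wiring, prompt layout, stopping criteria, and any critique step all live outside the model, which is why the same model can score quite differently under different scaffolds.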


Usually involves a lot of agents and their custom contexts or system prompts.

> I didn’t learn anything. I felt like I was flinging slop over the wall to an open-source maintainer.

Well, I'm sorry you feel that way; impostor syndrome is tough enough to deal with even without AI.

You seem to be driven by understanding, and you have a great tool to learn from here if you make an effort over time to grasp the “slop” you're throwing over the wall. Be curious, ask why several times, and explore guilt-free when you're in the right mindset.

I'm glad you got something useful out of it this time. Also, not everything you do with AI has to be useful or a final “deliverable”; it can also be a great toy and a window into more understanding.


What’s the diff with the new text? Only the word “solicitation” removed?
