My goal is to prevent Claude from blowing up my computer by erasing things it shouldn't touch. So the philosophy of my sandboxing is "You get write access to $allowlist, and read access to everything except for $blocklist".
I'm not concerned about data exfiltration, as implementing it well in a dev tool is too difficult, so my rules are limited to blocking highly sensitive folders by name.
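In sandbox-exec terms (macOS), a minimal profile expressing that philosophy might look roughly like the following. The paths are placeholders and SBPL is officially undocumented, so treat this as a sketch rather than my exact config:

```
;; Writes denied by default; reads allowed except for sensitive paths.
(version 1)
(allow default)                          ; permissive baseline for reads
(deny file-write*)                       ; no writes anywhere...
(allow file-write*
  (subpath "/Users/me/projects/myapp")   ; ...except the allowlisted project
  (subpath "/private/tmp"))
(deny file-read*
  (subpath "/Users/me/.ssh")             ; blocklist: secrets stay unreadable
  (subpath "/Users/me/.aws"))
```

Run the tool under it with `sandbox-exec -f profile.sb <command>`.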
Thank you for sharing a non-trivial working example of a sandbox-exec configuration. Having an exemplar such as what you have kindly shared is hugely beneficial for those of us looking to see what can be done with a tool such as this.
Wow, this is nearly an exact copy of Codex Monitor[1]: voice mode, project + threads/agents, git panel, PR button, terminal drawer, IDE integrations, local/worktree/cloud edits, archiving threads, etc.
I have both Codex Monitor and this new Codex app open side by side right now; aside from the theme, I struggle to tell them apart. Antigravity's Agent Manager is obviously different, but these two are twins.
I have a very hard time getting worked up over this. There are a ton of entrants in this category, and they all generally look the same. Cribbing features seems par for the course.
Antigravity is a white-labeled $2B fork of Windsurf, so it really starts there, but maybe someone knows what Windsurf derived from, to keep the chain going?
Yes, Codex compaction is in the latent space (as confirmed in the article):
> the Responses API has evolved to support a special /responses/compact endpoint [...] it returns an opaque encrypted_content item that preserves the model’s latent understanding of the original conversation
Is this what they mean by "encryption" - as in "no human-readable text"? Or are they actually encrypting the compaction outputs before sending them back to the client? If so, why?
"encrypted_content" is just a poorly worded variable name that indicates the content of that "item" should be treated as an opaque foreign key. No actual encryption (in the cryptographic sense) is involved.
This is not correct: encrypted_content is in fact encrypted content. For OpenAI to be able to support Zero Data Retention (ZDR), there needs to be a way for you to store reasoning content client-side without being able to see the actual tokens. The tokens need to stay secret because they often contain reasoning related to safety and instruction following. So OpenAI gives the content to you encrypted and keeps the decryption keys on its side, so it can be re-rendered into tokens when given back to the model.
There is also another reason: to prevent certain attacks that involve injecting things into reasoning blocks. Anthropic has published some studies on this. By using encrypted content, OpenAI can rely on it not being modified. OpenAI and Anthropic have both started to validate that you're not removing these messages between requests in certain modes, like extended thinking, for safety and performance reasons.
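The client-side contract this implies can be shown with a toy simulation: the server encrypts reasoning tokens under a key only it holds, the client stores the opaque blob and echoes it back verbatim, and only the server can recover (or detect tampering with) the tokens. This is purely illustrative; OpenAI's actual scheme and key management are not public:

```python
# Toy model of the encrypted_content contract. The "server" holds the key;
# the "client" only ever sees an opaque, authenticated blob.
import hashlib
import hmac
import secrets

SERVER_KEY = secrets.token_bytes(32)  # never leaves the server

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Simple hash-counter keystream, for illustration only.
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def server_encrypt(plaintext: bytes) -> bytes:
    nonce = secrets.token_bytes(16)
    ct = bytes(a ^ b for a, b in
               zip(plaintext, _keystream(SERVER_KEY, nonce, len(plaintext))))
    tag = hmac.new(SERVER_KEY, nonce + ct, hashlib.sha256).digest()
    return nonce + ct + tag  # opaque to the client

def server_decrypt(blob: bytes) -> bytes:
    nonce, ct, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(SERVER_KEY, nonce + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        # Catches injection/modification of reasoning blocks between requests.
        raise ValueError("blob was tampered with")
    return bytes(a ^ b for a, b in
                 zip(ct, _keystream(SERVER_KEY, nonce, len(ct))))

# Client round-trip: store the blob, send it back unchanged on the next request.
reasoning = b"chain-of-thought the client must not read"
blob = server_encrypt(reasoning)
assert blob != reasoning
assert server_decrypt(blob) == reasoning
```

The HMAC check is what gives the provider the "rely on it not being modified" property described above.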
Hmmm, no, I don't know this for sure. In my testing, the /compact endpoint seems to work almost too well for large/complex conversations, and it feels like it cannot contain the entire latent space, so I assumed it keeps pointers inside it (à la previous_response_id). On the other hand, OpenAI says it's stateless and compatible with Zero Data Retention, so maybe it can contain everything.
+1 from another happy Whispr Flow power user. I tried 4-5 similar apps and even built one with Assembly AI, but Whispr is a significant upgrade over the rest for correctly recognizing my accent and jargon. Having the custom vocabulary helps.
Do you happen to have a link with a more nuanced technical analysis of that (emergent) behavior? I’ve read only the pop-news version of that “escaping” story.
There is none. We don't understand LLMs well enough to be able to conduct a full fault analysis like this.
We can't trace the thoughts of an LLM the way we can trace code execution - the best mechanistic interpretability has to offer is being able to get glimpses occasionally. The reasoning traces help, but they're still incomplete.
Is it pattern-matching? Is it acting on its own internal goals? Is it acting out fictional tropes? Were the circumstances of the test scenarios intentionally designed to be extreme? Would this behavior have happened in a real world deployment, under the right circumstances?
The answer is "yes", to all of the above. LLMs are like that.
I think the issue is more that a lot of people weren't cynical enough. I knew Bitcoin was a shit currency when I first heard about it, and thought that was all there was to it. I didn't understand that while it was a shit currency, it was a great speculative asset. I thought people would look at it, go "that's dumb", and move on. Apparently I hadn't heard of, or understood the Dutch tulip mania and similar historical events. I presumed people would be better than they turned out to be, and that cost me a lot of potential capital gains.
It takes no imagination or insight to see reasons why something wouldn’t work. It’s the default mental pathway for every risk-averse beast. Skepticism is not born out of contentment and abundance but out of self-preservation. It’s not correlated with feeling enough, but with feeling bitterness and envy of those who took risks and gained an advantage instead of suffering consequences.
People who are content feel less need to take risks by accepting dubious statements without proof. They have what they need so why risk it for more?
Sceptical people will be grounded by what we know to be true. They will explore new ideas but will not be swept up by them. We need people like that or we'll waste our time on flights of fancy. But we need the irrational optimists to explore new ideas too. It's a classic exploration vs exploitation trade-off.
Many people who risked their money on Bitcoin likely had enough already; they risked the extra money they had lying around. Why not place bets on something you think might pay off? Is there something morally wrong with making an extra buck? Is it morally superior to just keep your money sitting in a bank account, or what?
I'm pretty sure these peeps who hang out at /r/buttcoin are going to work like regular people to get some fiat currency into their beloved government-blessed bank accounts. So I guess they don't feel like they have enough.
To be honest I don't think the skeptical people thought bitcoin's success was probable and that's why they didn't bet on it. It's not really anything to do with them being content with what they have.
But it could be this too in some cases.
Some people do things unless they find a reason not to, whereas a skeptical person will only do things if they find a reason to.
People who really feel they have enough might not see any reason to spend their time or effort placing bets, even on things they think are probable. But I don't think many people think that way.
These spam repositories have been deleted, but I still had lingering notifications stuck on GitHub, and I couldn't see them in the UI to remove them (but the small blue notification dot was constantly on). The API hack resolved this problem.
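For anyone hitting the same phantom blue dot: the call that cleared it for me is GitHub's documented mark-all-as-read endpoint (the token needs the notifications scope; the exact body fields are in the REST API docs):

```
# Mark every notification as read, including ones whose repos no longer
# exist and therefore never show up in the web UI.
curl -X PUT \
  -H "Authorization: Bearer $GITHUB_TOKEN" \
  -H "Accept: application/vnd.github+json" \
  https://api.github.com/notifications \
  -d '{"read": true}'
```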
> clean/erase/undo/soft-delete/hard-delete mistakes[...] make the change tracking capable of time travel itself [...] Transitioning to an EAV
I just finished building out all of that + more (e.g., data lineage, multi-verse, local overrides, etc), also on PG. Reach out if you want to chat and get nerd sniped!
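For anyone curious what EAV plus time travel looks like in miniature, here is a toy sketch (on SQLite for self-containment; the real thing is on Postgres, and all table/column names here are invented): every write appends an immutable row with a validity interval, soft-delete just closes the interval, and "as of" queries filter on it.

```python
# Toy EAV store with time travel: writes append, soft-delete closes the
# validity interval, and queries can replay any past moment.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE eav (
        entity     TEXT NOT NULL,
        attribute  TEXT NOT NULL,
        value      TEXT,
        valid_from INTEGER NOT NULL,  -- when this version became true
        valid_to   INTEGER            -- NULL = still current
    )""")

def set_attr(entity, attribute, value, now):
    # Close out the current version, then append the new one.
    db.execute("UPDATE eav SET valid_to=? "
               "WHERE entity=? AND attribute=? AND valid_to IS NULL",
               (now, entity, attribute))
    db.execute("INSERT INTO eav VALUES (?,?,?,?,NULL)",
               (entity, attribute, value, now))

def soft_delete(entity, attribute, now):
    db.execute("UPDATE eav SET valid_to=? "
               "WHERE entity=? AND attribute=? AND valid_to IS NULL",
               (now, entity, attribute))

def get_attr(entity, attribute, as_of):
    row = db.execute(
        "SELECT value FROM eav WHERE entity=? AND attribute=? "
        "AND valid_from<=? AND (valid_to IS NULL OR valid_to>?)",
        (entity, attribute, as_of, as_of)).fetchone()
    return row[0] if row else None

set_attr("user:1", "email", "a@example.com", now=100)
set_attr("user:1", "email", "b@example.com", now=200)
soft_delete("user:1", "email", now=300)

assert get_attr("user:1", "email", as_of=150) == "a@example.com"  # time travel
assert get_attr("user:1", "email", as_of=250) == "b@example.com"
assert get_attr("user:1", "email", as_of=350) is None             # soft-deleted
```

Because nothing is ever overwritten, "undo" is just reading (or re-asserting) an earlier version, and change tracking itself is time-travelable for free.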