Hacker News | e1g's comments

You can use an alias, which takes priority over $PATH. For example, I have this in my .zshrc to override the "claude" executable so it runs in the OS sandbox:

    alias claude="sandbox-exec -f ~/agents-jail.sb ~/.local/bin/claude --dangerously-skip-permissions"

How does your sandbox ruleset look? I've been using containers on Linux but I don't have a solution for macOS.

Here's my ruleset https://gist.github.com/eugene1g/ad3ff9783396e2cf35354689cc6...

My goal is to prevent Claude from blowing up my computer by erasing things it shouldn't touch. So the philosophy of my sandboxing is "you get write access to $allowlist, and read access to everything except $blocklist".

I'm not concerned about data exfiltration, as preventing it well in a dev tool is too difficult, so my rules are limited to blocking highly sensitive folders by name.
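The allowlist/blocklist philosophy can be sketched as a minimal SBPL profile. Everything below is an illustrative assumption, not the linked gist: the paths, the profile location, and the use of `string-append`/`param` (supported by Apple's Scheme-based profile language, but hedged here) are all placeholders.

```shell
# Hypothetical minimal sandbox profile; the real ruleset is in the gist above.
# In SBPL, later matching rules override earlier ones.
cat > /tmp/agents-jail.sb <<'EOF'
(version 1)
(allow default)                    ; read access to (almost) everything...
(deny file-write*)                 ; ...but no writes by default
(allow file-write*                 ; $allowlist: project dir and scratch space
  (subpath "/tmp")
  (subpath (string-append (param "HOME") "/code")))
(deny file-read*                   ; $blocklist: sensitive folders by name
  (subpath (string-append (param "HOME") "/.ssh"))
  (subpath (string-append (param "HOME") "/.aws")))
EOF
# Usage (macOS only; -D passes parameters referenced via (param "HOME")):
# sandbox-exec -D HOME="$HOME" -f /tmp/agents-jail.sb ~/.local/bin/claude --dangerously-skip-permissions
```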


That's neat. I'm going to base my ruleset off of yours. I've been messing around with Claude more and more lately, and I need to do something.

> Here's my ruleset ...

Thank you for sharing a non-trivial working example of a sandbox-exec configuration. Having an exemplar like the one you have kindly shared is hugely beneficial for those of us looking to see what can be done with a tool like this.


Thank you - you inspired me to open-source this work properly -> https://eugene1g.github.io/agent-safehouse/

> Thank you - you inspired me to open-source this work properly

Both I and the OSS community thank you.

  Great things are done by a series of small things brought 
  together.[0]
0 - https://www.brainyquote.com/quotes/vincent_van_gogh_120866

Wow, this is nearly an exact copy of Codex Monitor[1]: voice mode, project + threads/agents, git panel, PR button, terminal drawer, IDE integrations, local/worktree/cloud edits, archiving threads, etc.

[1] https://github.com/Dimillian/CodexMonitor


Codex Monitor seems like an Antigravity Agent Manager clone. It came out after, too.

A bunch of the features you listed were already in the Codex extension too. False outrage at its finest.


I have both Codex Monitor and this new Codex app open side by side right now; aside from the theme, I struggle to tell them apart. Antigravity's Agent Manager is obviously different, but these two are twins.

I have a very hard time getting worked up over this. There are a ton of entrants in this category, they all generally look the same. Cribbing features seems par for the course.

Antigravity is a white-labeled $2B fork of Windsurf, so it really starts there; but maybe someone knows what Windsurf derived from, to keep the chain going?

cursor?

from what I can tell, the people behind windsurf were at it first

oh, codeium? that was them?

Maybe github copilot then


was langchain before copilot?

Yes, Codex compaction is in the latent space (as confirmed in the article):

> the Responses API has evolved to support a special /responses/compact endpoint [...] it returns an opaque encrypted_content item that preserves the model’s latent understanding of the original conversation


Is this what they mean by "encryption" - as in "no human-readable text"? Or are they actually encrypting the compaction outputs before sending them back to the client? If so, why?


"encrypted_content" is just a poorly worded variable name that indicates the content of that "item" should be treated as an opaque foreign key. No actual encryption (in the cryptographic sense) is involved.


This is not correct; encrypted_content is in fact encrypted content. For OpenAI to be able to support ZDR, there needs to be a way for you to store reasoning content client-side without being able to see the actual tokens. The tokens need to stay secret because they often contain reasoning related to safety and instruction following. So OpenAI gives it to you encrypted and keeps the keys for decrypting on their side, so it can be re-rendered into tokens when given to the model.

There is also another reason: to prevent some attacks related to injecting things into reasoning blocks. Anthropic has published some studies on this. By using encrypted content, OpenAI can rely on it not being modified. OpenAI and Anthropic have started to validate that you're not removing these messages between requests in certain modes, like extended thinking, for safety and performance reasons.
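For concreteness, the ZDR-style round trip looks roughly like this with the Responses API: request `"store": false` plus `"include": ["reasoning.encrypted_content"]`, then hand the opaque item back on the next turn. The model name, the item id, and the encrypted blob below are placeholders, and the exact schema is a sketch based on the public API docs, not a verified request:

```shell
# Build a next-turn request that carries the opaque reasoning item back.
cat > /tmp/responses-request.json <<'EOF'
{
  "model": "gpt-5-codex",
  "store": false,
  "include": ["reasoning.encrypted_content"],
  "input": [
    {"type": "reasoning",
     "id": "rs_placeholder",
     "encrypted_content": "gAAAA-opaque-placeholder",
     "summary": []},
    {"role": "user", "content": "continue where you left off"}
  ]
}
EOF
# curl -s https://api.openai.com/v1/responses \
#   -H "Authorization: Bearer $OPENAI_API_KEY" \
#   -H "Content-Type: application/json" \
#   -d @/tmp/responses-request.json
```

The client never sees the plaintext reasoning; it only shuttles the encrypted item between turns.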


Are you sure? For reasoning, encrypted_content is for sure actually encrypted.


Hmmm, no, I don't know this for sure. In my testing, the /compact endpoint seems to work almost too well for large/complex conversations, and it feels like it cannot contain the entire latent space, so I assumed it keeps pointers inside it (ala previous_response_id). On the other hand, OpenAI says it's stateless and compatible with Zero Data Retention, so maybe it can contain everything.


They say they do not compress the user messages, but yeah, its purpose is to do very lossy compression of everything else. I'd expect it to be small.


Ah, that makes more sense. Thanks!


+1 from another happy Whispr Flow power user. I tried 4-5 similar apps and even built one with Assembly AI, but Whispr is a significant upgrade above the rest for correctly recognizing my accent and jargon. Having the custom vocabulary helps.


Do you happen to have a link with a more nuanced technical analysis of that (emergent) behavior? I’ve read only the pop-news version of that “escaping” story.


There is none. We don't understand LLMs well enough to be able to conduct a full fault analysis like this.

We can't trace the thoughts of an LLM the way we can trace code execution - the best mechanistic interpretability has to offer is being able to get glimpses occasionally. The reasoning traces help, but they're still incomplete.

Is it pattern-matching? Is it acting on its own internal goals? Is it acting out fictional tropes? Were the circumstances of the test scenarios intentionally designed to be extreme? Would this behavior have happened in a real world deployment, under the right circumstances?

The answer is "yes", to all of the above. LLMs are like that.


You might have missed the appendix the Anthropic blog post linked to, which has additional detail.

https://www.anthropic.com/research/agentic-misalignment

https://assets.anthropic.com/m/6d46dac66e1a132a/original/Age...


Knowing who is behind this campaign, 90% chance the extra white space, the graffiti, and this article were all commissioned by them intentionally.

Luckily this did not translate to sales, or we’d have another wave of Cluely BS copycats.


The cynic always sounds smart and remains poor.


I think the issue is more that a lot of people weren't cynical enough. I knew Bitcoin was a shit currency when I first heard about it, and thought that was all there was to it. I didn't understand that while it was a shit currency, it was a great speculative asset. I thought people would look at it, go "that's dumb", and move on. Apparently I hadn't heard of, or understood the Dutch tulip mania and similar historical events. I presumed people would be better than they turned out to be, and that cost me a lot of potential capital gains.


> The cynic always sounds smart and remains poor.

Why would cynics be poor? The OP mentioned "techies", many of whom have jobs paying 6 figures a year.


The Cynic on the other hand knows how to enjoy life with just enough. He is free, a spy for the gods.


But he keeps writing and talking about people who have more than enough, and how they are wrong.


He points out their foolishness - he has what they will never have. Enough.


It takes no imagination or insight to see reasons why something wouldn’t work. It’s the default mental pathway for every risk-averse beast. Skepticism is not born out of contentment and abundance but out of self-preservation. It’s not correlated with feeling enough, but with feeling bitterness and envy of those who took risks and gained an advantage instead of suffering consequences.


People who are content feel less need to take risks by accepting dubious statements without proof. They have what they need so why risk it for more?

Sceptical people will be grounded by what we know to be true. They will explore new ideas but will not be swept up by them. We need people like that or we'll waste our time on flights of fancy. But we need the irrational optimists to explore new ideas too. It's a classic exploration vs exploitation trade-off.


Many people who risked their money on Bitcoin likely had enough, and they risked the extra money they had lying around. Why not place bets on something you think might be probable? Is there something morally wrong in making an extra buck? Is it morally superior just to keep your money lying in a bank account, or what?


To have enough by your definition and to feel like one has enough are two very different ideas of enough.

The Cynic has enough if he has his cloak and found some food in the garbage can. He feels like he has enough. You might feel like that's not enough.

Conversely I might think the richest man in the world (by net worth) has enough. He feels like he needs more.


I'm pretty sure these peeps who hang out at /r/buttcoin are going to work like regular people to get some fiat currency into their beloved government-blessed bank accounts. So I guess they don't feel like they have enough.


I have no idea what a buttcoin is, sorry.


To be honest I don't think the skeptical people thought bitcoin's success was probable and that's why they didn't bet on it. It's not really anything to do with them being content with what they have.

But it could be this too in some cases.

Some people do things unless they find a reason not to, whereas a skeptical person will only do things if they find a reason to.

People who really feel they have enough might not see any reason to spend their time or effort placing bets, even on things they think are probable. But I don't think many people think that way.


> It takes no imagination or insight to see reasons why something wouldn’t work. It’s the default mental pathway for every risk-averse beast.

Quite the opposite: it takes a lot of strong will, and risk, to talk against a hype. A kind of risk-affinity that unfortunately rarely makes you rich. :-(


To remove resulting notifications, see instructions here https://github.com/orgs/community/discussions/174283#discuss...

These spam repositories have been deleted, but I still had lingering notifications stuck on GitHub, and I couldn't see them in the UI to remove them (but the small blue notification dot was constantly on). The API hack resolved this problem.
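For reference, the same cleanup can be sketched with the standard GitHub REST notification endpoints via the `gh` CLI. The thread ID in the usage line is a placeholder; use the IDs returned by the list call.

```shell
# Sketch of clearing "ghost" notifications left behind by deleted spam repos.
# Wrapped in a function so nothing hits the network until you call it.
clear_stuck_notifications() {
  # 1. List unread threads, including ones the web UI no longer renders
  gh api /notifications --jq '.[] | "\(.id)\t\(.subject.title)"'
  # 2. Mark one stuck thread as read (pass its id as $1)
  [ -n "$1" ] && gh api -X PATCH "/notifications/threads/$1"
  # 3. Or mark everything as read up to now
  gh api -X PUT /notifications \
    -f last_read_at="$(date -u +%Y-%m-%dT%H:%M:%SZ)" -F read=true
}
# Usage: clear_stuck_notifications 1234567890
```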


Came here looking for this. Thank you - removed the annoying blue notification now.


Extremely slow for me - takes minutes to get anything done. Regular GPT5 was much faster. Hoping it’s mostly due to the launch day.


I've been using gpt-5 with effort=high, but for gpt-5-codex, try: `-c model_reasoning_effort=medium`.

On high it is totally unusable.


even on medium ... gpt-5 was way faster, at least that's my first impression


> clean/erase/undo/soft-delete/hard-delete mistakes[...] make the change tracking capable of time travel itself [...] Transitioning to an EAV

I just finished building out all of that + more (e.g., data lineage, multi-verse, local overrides, etc), also on PG. Reach out if you want to chat and get nerd sniped!
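Since the quoted comment asks about this pattern: one common Postgres shape for EAV plus time travel is an append-only attribute table keyed by a validity range. All names and types below are a hypothetical sketch, not the parent commenter's actual schema:

```shell
# Write the sketch to a file; apply with: psql -f /tmp/time_travel.sql
cat > /tmp/time_travel.sql <<'EOF'
-- btree_gist lets the exclusion constraint mix = with range-overlap (&&)
CREATE EXTENSION IF NOT EXISTS btree_gist;

-- Append-only EAV: one row per (entity, attribute, validity window).
-- A NULL value acts as a soft delete; nothing is updated in place.
CREATE TABLE attr_history (
  entity_id bigint    NOT NULL,
  attr      text      NOT NULL,
  value     jsonb,
  valid     tstzrange NOT NULL DEFAULT tstzrange(now(), NULL),
  EXCLUDE USING gist (entity_id WITH =, attr WITH =, valid WITH &&)
);

-- "Now" view; time travel is the same query with: valid @> $as_of
CREATE VIEW attr_current AS
  SELECT entity_id, attr, value
  FROM attr_history
  WHERE upper_inf(valid) AND value IS NOT NULL;
EOF
```

Undo/soft-delete then falls out naturally: closing a row's range "deletes" it, and re-opening a new row with the old value restores it.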

