It's also formed over millions of years underground, similarly to oil, if I understand correctly, so it can be a byproduct of natural gas extraction.

It's often found alongside natural gas because the rock structures that can trap methane can also trap other gases, but the original source is different: thermal decomposition of organic matter for natural gas, and radioactive decay (mostly of uranium and thorium) for helium.

I agree that the "accumulation over millions of years" is similar (and similarly a potential problem if we burn through all that accumulation).


I really enjoyed this Odd Lots podcast episode, which covered similar points and had a lot of "wat" moments for me, including the US selling off its strategic helium reserve at a loss because politicians labeled it a "party balloon reserve", how long helium takes to form naturally, and how hard it is to find, process, and transport.

https://m.youtube.com/watch?v=bjc6MgUY0BE


Author here. Yeah, after reading all the comments here I've changed my mind: this is related to the harness. The interesting interaction with the harness is that Claude effectively authorizes tool use in an unintuitive way.

So "please deploy" or "tear it down" makes it overconfident in using destructive tools, as if the user had very explicitly authorized something, and this makes it a worse bug when using Claude code over a chat interface without tool calling where it's usually just amusing to see


Amazing example, I added it to the article. Hope that's ok :)

> not betting my entire operation

If the only thing stopping a bad 'deploy' command from destroying your entire operation is that you don't trust the agent to run it, then you have worse problems than too much trust in agents.

I similarly use my 'intuition' (i.e. evidence-based previous experience) to decide which services the people on my team have access to.


I'm not saying intuition has no place in decision making, but I do take issue with saying it applies equally to human colleagues and autonomous agents. It would be just as unreliable if people on your team displayed random regressions in their capabilities on a month-to-month basis.

Author here. Interesting to hear. I generally start a new chat for each interaction, so I've never noticed this in the chat interfaces, only with Claude via Claude Code. But I guess my sessions there do get much longer, so maybe I'm wrong that it's a harness bug.

I've had long conversations with ChatGPT and it really does start losing context fast. You have to keep correcting it and re-feeding instructions.

It seems to degenerate into the same patterns. It's like the context blurs and it begins to value training data more than context.


Yes, and with very long chats, you'll see it forget how to do things like make tool calls - or even respond at all! I've had ChatGPT reply with raw JSON, regurgitate an earlier prompt, reply with a single newline, regurgitate information from a completely different chat, reply in a foreign language, and more.

Things get really wacky as it approaches decoherence.


I've seen the raw JSON before. I didn't realize that was an actual failure mode.

I've also had it fail to respond in long chats, but I assumed that was a network error, even though there were no error messages.


Yeah, the raw JSON (in my case) is the result of a failed tool call: it was trying to generate an image. With thinking models, you can observe the degeneration of its understanding of image tool calls over the lifetime of a chat. It eventually puzzles over where images are supposed to be emitted, how it's supposed to write text, whether it's allowed to provide commentary - and eventually it gets all of it wrong. This also happens with file citations (in projects) and web search calls.

Author here - yeah, maybe 'reasoning' is the incorrect term here; I just mean the dialogue that Claude generates for itself between turns, before producing the output that it gives back to the user.

Yeah, that's usually called "reasoning" or "thinking" tokens AFAIK, so I think the terminology is correct. But from the traces I've seen, they're usually in a sort of diary style and start by repeating the last user requests and tool results. They're not introducing new requirements out of the blue.

Also, they're usually bracketed by special tokens to distinguish them from "normal" output for both the model and the harness.

(They can get pretty weird, like in the "user said no but I think they meant yes" example from a few weeks ago. But I think that requires a few rounds of wrong conclusions and motivated reasoning before it gets to that point, not right at the beginning.)
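For illustration, a minimal sketch of how a harness might use that bracketing to split reasoning from the visible reply. The <think>...</think> delimiters here are placeholders I made up; each model family defines its own special tokens.

    import re

    # Placeholder delimiters (assumption): real models use their own
    # special reasoning tokens, which differ per model family.
    THINK = re.compile(r"<think>(.*?)</think>", re.S)

    def split_reasoning(raw: str) -> tuple[str, str]:
        """Separate bracketed reasoning from the user-visible reply."""
        reasoning = "\n".join(THINK.findall(raw))
        visible = THINK.sub("", raw).strip()
        return reasoning, visible

    raw = "<think>User asked to deploy; last tool call succeeded.</think>Deployed."
    print(split_reasoning(raw))
    # ('User asked to deploy; last tool call succeeded.', 'Deployed.')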


There are a bunch of invisible Unicode characters that I used to build something similar a while back, pre-LLMs, to hide state info in Telegram messages and make bots more powerful.

https://github.com/sixhobbits/unisteg
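A minimal sketch of the idea, not the actual unisteg API (names here are illustrative): encode payload bytes as zero-width characters appended to a normal-looking message.

    # Zero-width space / zero-width non-joiner stand in for bits 0 and 1.
    ZERO, ONE = "\u200b", "\u200c"

    def hide(cover: str, payload: str) -> str:
        """Append the payload as invisible bits after the visible text."""
        bits = "".join(f"{b:08b}" for b in payload.encode("utf-8"))
        return cover + "".join(ZERO if bit == "0" else ONE for bit in bits)

    def reveal(message: str) -> str:
        """Recover the hidden payload from the zero-width characters."""
        bits = "".join("0" if c == ZERO else "1" for c in message if c in (ZERO, ONE))
        data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
        return data.decode("utf-8")

    # hide("Order confirmed!", "state=42") renders identically to
    # "Order confirmed!" in a chat client, but reveal() returns "state=42".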


They had gold worth X to the market but X minus 11 billion on paper. So when France accounted for its gold in euro terms, it would say it had X minus 11 billion euros' worth of gold.

Now they still have the same amount of gold, but they "realized" a gain of 11 billion. They don't have that much cash left after the repurchase, but now they say they have X euros' worth of gold, which is 11 billion more than before.

So no, they didn't make a profit from this, as gold is priced higher on both sides of the Atlantic than the last time they did their accounting updates.
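A worked sketch with made-up numbers (only the 11 billion gap comes from the thread) showing why the revaluation is a paper gain rather than a profit:

    # Illustrative figures only; "X" is not specified in the thread.
    book_value_eur   = 90e9    # gold carried on paper at X minus 11 billion
    market_value_eur = 101e9   # the same gold at today's market price (X)

    revaluation_gain = market_value_eur - book_value_eur
    print(f"paper gain: {revaluation_gain / 1e9:.0f}bn EUR")  # 11bn

    # No bars were sold at a markup and no new cash appeared: the same
    # gold is simply recorded at market value instead of its old book value.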


> worth X to the market but X minus 11 billion on paper.

Why was it worth “X minus 11 billion”?


Probably based on the price they paid for it, or on the last time they did some kind of asset accounting to calculate the euro value of all assets held.

> It is quite difficult to leave a country without simultaneously entering another

It is trivial from any country that is not landlocked: you just have to sail into international waters. What is difficult is to stay there.
