It's often found alongside natural gas because the rock structures that trap methane can also trap other gases, but the original source is different: thermal decomposition of organic matter for natural gas, and radioactive decay (mostly of uranium and thorium) for helium.
I agree that the "accumulation over millions of years" is similar (and similarly a potential problem if we burn through all that accumulation).
I really enjoyed the Odd Lots podcast episode that covered similar points and had a lot of "wat" moments for me, including the US selling off its strategic helium reserve at a loss because politicians labeled it the "party balloon reserve", how long helium takes to produce naturally, and how hard it is to find, process, and transport.
Author here. Yeah, after reading all the comments here I think I've changed my mind: this is related to the harness. The interesting interaction with the harness is that Claude effectively authorizes tool use in a non-intuitive way.
So "please deploy" or "tear it down" makes it overconfident in using destructive tools, as if the user had very explicitly authorized something. That makes this a worse bug in Claude Code than in a chat interface without tool calling, where it's usually just amusing to see.
> not betting my entire operation

If the only thing stopping a bad 'deploy' command from destroying your entire operation is that you don't trust the agent to run it, then you have worse problems than too much trust in agents.
I similarly use my 'intuition' (i.e. evidence-based previous experience) to decide which services people on my team can access.
I'm not saying intuition has no place in decision making, but I do take issue with saying it applies equally to human colleagues and autonomous agents. It would be just as unreliable if people on your team displayed random regressions in their capabilities on a month-to-month basis.
Author here. Interesting to hear; I generally start a new chat for each interaction, so I've never noticed this in the chat interfaces, only with Claude via Claude Code. But I guess my sessions there do get much longer, so maybe I'm wrong that it's a harness bug.
Yes, and with very long chats, you'll see it even forget how to do things like make tool calls - or even respond at all! I've had ChatGPT reply with raw JSON, regurgitate an earlier prompt, reply with a single newline, regurgitate information from a completely different chat, reply in a foreign language, and more.
Things get really wacky as it approaches decoherence.
Yeah, the raw JSON (in my case) is the result of a failed tool call; it was trying to generate an image. With thinking models, you can observe the degeneration of its understanding of image tool calls over the lifetime of a chat. It eventually puzzles over where images are supposed to be emitted, how it's supposed to write text, whether it's allowed to provide commentary, and eventually it gets all of it wrong. This also happens with file citations (in projects) and web search calls.
Author here. Yeah, maybe 'reasoning' is the incorrect term here; I just mean the dialogue Claude generates for itself between turns, before producing the output that it gives back to the user.
Yeah, that's usually called "reasoning" or "thinking" tokens AFAIK, so I think the terminology is correct. But from the traces I've seen, they're usually in a sort of diary style and start with repeating the last user requests and tool results. They're not introducing new requirements out of the blue.
Also, they're usually bracketed by special tokens to distinguish them from "normal" output for both the model and the harness.
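To illustrate the bracketing: some open models (DeepSeek-R1, for example) emit their reasoning between `<think>` tags, and the harness strips that span before showing the reply to the user. The exact delimiters are model-specific special tokens; this is a minimal sketch assuming the literal `<think>`/`</think>` form:

```python
import re

# Raw model output: reasoning span followed by the user-visible answer.
raw = "<think>User asked for X; last tool returned Y. Plan: answer directly.</think>Here is the answer."

# A harness would drop the bracketed reasoning before display.
visible = re.sub(r"<think>.*?</think>", "", raw, flags=re.DOTALL).strip()
print(visible)  # -> Here is the answer.
```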
(They can get pretty weird, like in the "user said no but I think they meant yes" example from a few weeks ago. But I think that requires a few rounds of wrong conclusions and motivated reasoning before it can get to that point - and not at the beginning)
There are a bunch of invisible characters that I used a while back, pre-LLMs, to build something similar: hiding state info in Telegram messages to make bots more powerful.
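The basic trick can be sketched like this (a hypothetical, simplified encoding: zero-width space and zero-width non-joiner stand in for the bits 0 and 1; a real bot would also want framing and error handling):

```python
# Hide a short payload in visible text using zero-width Unicode characters.
ZERO, ONE = "\u200b", "\u200c"  # zero-width space, zero-width non-joiner

def hide(visible: str, payload: str) -> str:
    """Append the payload's bits as invisible characters."""
    bits = "".join(f"{byte:08b}" for byte in payload.encode("utf-8"))
    return visible + "".join(ONE if bit == "1" else ZERO for bit in bits)

def reveal(message: str) -> str:
    """Recover the payload by reading back the invisible characters."""
    bits = "".join("1" if ch == ONE else "0" for ch in message if ch in (ZERO, ONE))
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")

msg = hide("Totally normal message", "state=42")
print(reveal(msg))  # -> state=42
```

The message renders identically to the plain text in most chat clients, but the bot can read its own state back out of it.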
They had gold worth X on the market but carried at X minus 11 billion on paper. So when France accounted for its gold in euro terms, it would say it had X minus 11 billion euros' worth of gold.
Now they still have the same amount of gold, but they "realized" a gain of 11 billion. They don't have that much cash left after the repurchase, but they now say they have X euros' worth of gold, which is 11 billion more than before.
So no, they didn't make a profit from this, as gold is priced higher on both sides of the Atlantic than the last time they updated their accounting.
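With toy numbers (hypothetical figures, just to show the shape of the revaluation, not France's actual books):

```python
# Same gold before and after; only the carrying value changes.
book_value = 100    # gold carried on the books, billions of euros (hypothetical)
market_value = 111  # the same gold at current market prices (hypothetical)

paper_gain = market_value - book_value
print(paper_gain)  # -> 11 (a revaluation gain, not new wealth)
```

The 11 billion shows up as an accounting gain, but the quantity of gold is unchanged, which is why it isn't a trading profit.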