Sounds reasonable to me. I think this thread is just the way online discourse tends to go. Actually it’s probably better than average, but still sometimes disappointing.
i played with this a bit the other night and, ironically, i think everyone should give it a shot as an alternative mode they might sometimes switch into. not to save tokens, but to... see things in a different light.
it's kind of great for the "eli5", not because it's any more right or wrong, but because caveman-speak sometimes presents something to me in a way that's almost... really clear and simple. it feels like it cuts through bullshit just a smidge. seeing something framed by a caveman has, on a couple of occasions, peeled back a layer i didn't see before.
it is, for whatever reason, useful somehow to me, the human. maybe seeing it laid out in caveman bulletpoints gives it this weird brevity that processes a little differently. if you layer in caveman talk about caves, tribes, etc., it has a sort of primal, survivalist way of framing things, which can oddly enough help me process an understanding.
plus it makes me laugh. which keeps me in a good mood.
Interesting point! Based on what you said, in a way caveman does save your human brain tokens. Grammar rules evolve in a particular environment to reduce ambiguity, and I think we are all familiar enough with caveman for it to make sense to all of us as a common register. For example, word order carries the semantics in modern English, so "The dog bit the grandma" and "Dog bit grandma" mean the same thing and the articles are expendable. In languages where cases carry the semantics (like German), word order alone does not resolve ambiguity. Articles exist in English due to its Germanic roots.
> They aren't steganographically hiding useful computation state in words like "the" and "and".
Do you know that to be true? These aren’t just tokens; they’re tokens with specific position encodings preceded by specific context. The position as a whole is a lot richer than you make it out to be. I think this is probably an unanswered empirical question, unless you’ve read otherwise.
The output is "just tokens"; the "position encodings" and "context" are inputs to the LLM function, not outputs. The information that a token can carry is bounded by the entropy of that token. A highly predictable token (given the context) simply can't communicate anything.
Again: if a tiny language model or even a basic markov model would also predict the same token, it's a safe bet it doesn't encode any useful thinking when the big model spits it out.
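For what it's worth, the claim is easy to eyeball. A minimal sketch (Python, assuming the Hugging Face transformers library, with GPT-2 standing in for the "tiny model"):

    # Per-token predictive entropy under a small model. If GPT-2 already
    # assigns near-zero entropy to a token, that token has almost no
    # capacity to carry extra "thinking" when a big model emits it.
    import torch
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tok = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")
    model.eval()

    text = "The dog bit the grandma and ran away."
    ids = tok(text, return_tensors="pt").input_ids

    with torch.no_grad():
        logits = model(ids).logits  # (1, seq_len, vocab_size)

    # Entropy (in bits) of the model's prediction at each position.
    probs = torch.softmax(logits[0, :-1], dim=-1)
    entropy = -(probs * probs.log2().clamp(min=-100)).sum(-1)

    for t, h in zip(ids[0, 1:], entropy):
        print(f"{tok.decode(int(t))!r:>10} {h.item():5.2f} bits")

Function words after enough context tend to score far lower than content words, which is the point: a token's information content is capped by how unpredictable it is.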
First that scratchpads matter, then why they matter, then that they don’t even need to be meaningful tokens, then a conceptual framework for the whole thing.
I don’t see the relevance. The discussion is over whether boilerplate text that occurs intermittently in the output, purely for the sake of linguistic correctness or sounding professional, is of any benefit. Chain of thought doesn’t look like that to begin with; it’s a contiguous block of text.
To boil it down: chain of thought isn’t really a chain of thought; it’s just more generated output added to the context. Those tokens participate in subsequent forward passes that are doing things we don’t see or even understand. More LLM-generated context matters.
That is not how CoT works. It is all in context. All influenced by context. This is a common and significant misunderstanding of autoregressive models and I see it on HN a lot.
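Since this confusion keeps coming up, here's the whole mechanism as a sketch (plain Python; llm_step and sample are hypothetical stand-ins for a real model call and a sampling rule):

    # Autoregressive decoding in miniature. "Chain of thought" is not a
    # separate mechanism: every emitted token, reasoning or boilerplate,
    # is appended to the context and shapes every later forward pass.
    def generate(llm_step, sample, prompt_tokens, n_tokens):
        context = list(prompt_tokens)
        for _ in range(n_tokens):
            dist = llm_step(context)      # full forward pass over context
            context.append(sample(dist))  # the only state that persists
        return context

There is no hidden scratchpad on the side; the context list is the whole story.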
> Note that none of this tells us whether language models actually feel anything or have subjective experiences.
You’ll never find that in the human brain either. There’s the machinery of the neural correlates of experience; we never see the experience itself. That’s likely because the distinction is vacuous: they’re the same thing.
I know I feel experience. I don't know for sure if you do, but it seems a very reasonable extension to other people. LLMs are a radical jump though that needs a greater degree of justification.
And what kind of evidence would convince you? What experiment would ever bridge this gap? You’re relying entirely on similarity between yourself and other humans. This doesn’t extend very well to anything, even animals, though more so than machines. By framing it this way have you baked in the conclusion that nothing else can be conscious on an a priori basis?
There are fields that focus on these areas and numerous ideas around what the criteria would be. One of the common understandings is that recurrent processing is likely a foundational layer for consciousness, and agents do not have this currently.
I'd say that in terms of evidence I'd want to establish specific functional criteria that seem related to consciousness and then try to establish those criteria existing in agents. If we can do so, then they're conscious. My layman understanding is that they don't really come close to some of the fairly fundamental assumptions.
Unsurprisingly, there are a lot of frameworks for this that have already been applied to LLMs.
Sorry, what are you saying? That there are people who study these things, and you’d want to see… something as evidence? Your post doesn’t actually seem to have any substantive content.
I noted that there are people who work on designing those sorts of tests and answering these questions and then I described what good evidence would look like.
I'm not sure what evidence would convince me, but I don't think the way LLMs act is convincing enough. The kinds of errors they make and the fact that they operate in very clear discrete chunks make it seem hard to me to attribute subjective experience to them.
I have decided to draw an arbitrary line at mammals, just because you gotta put a line somewhere and move on with your life. Mammals shouldn’t be mistreated, for almost any reason.
Sometimes the whole animal kingdom, sometimes all living organisms, depending on context. Like, I would rather not harm a mosquito, but if it’s in my house I will feel no remorse for killing it.
LLMs, or any other artificial “life”, I simply do not and will not care about, even though I accept that to some extent my entire consciousness can be simulated neuron by neuron in a large enough computer. Fuck that guy, tbh.
Do you think these LLMs have subjective experiences? (By "subjective experience" I mean the thing that makes stepping on an ant worse than kicking a pebble.) And if so, do you still use them? Additionally: when do you think that subjectivity started? Was there a "there" there with gpt-2?
Yes, I think they probably are conscious, though what their qualia are like might be incomprehensible to me. I don’t think that being conscious means being identical to human experience.
Philosophically I don’t think there is a point where consciousness arises. I think there is a point where a system starts to be structured in such a way that it can do language and reasoning, but I don’t think these are any different than any other mechanisms, like opening and closing a door. Differences of scale, not kind. Experience and what it is to be are just the same thing.
And yes, I use them. I try not to mistreat them in a human-relatable sense, in case that means anything.
It's entirely too much to put in a Hacker News comment, but if I had to phrase my beliefs as precisely as possible, it would be something like:
> "Phenomenal consciousness arises when a self-organizing system with survival-contingent valence runs recurrent predictive models over its own sensory and interoceptive states, and those models are grounded in a first-person causal self-tag that distinguishes self-generated state changes from externally caused ones."
I think that our physical senses and mental processes are tools for reacting to valence stimuli. Before an organism can represent "red"/"loud" it must process states as approach/avoid, good/bad, viable/nonviable. There's a formalization of this known as "Psychophysical Principle of Causality."
Valence isn't attached to representations -- representations are constructed from valence. I.e., you don't first see red and then decide it's threatening. The threat-relevance is the prior, and "red" is a learned compression of a particular pattern of valence signals across sensory channels.
Humans are constantly generating predictions about sensory input, comparing those predictions to actual input, and updating internal models based on prediction errors. Our moment-to-moment conscious experience is our brain's best guess about what's causing its sensory input, while constrained by that input.
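Concretely, the loop is something like this (a toy scalar sketch, not a brain model; all numbers illustrative):

    # Toy predictive-processing loop: hold a guess about a "sensory"
    # signal, predict it, measure the prediction error, and update the
    # internal model to shrink that error.
    import random

    belief = 0.0    # the internal model: a single number
    lr = 0.1        # how strongly prediction errors update the model

    def sense():
        return 5.0 + random.gauss(0, 1)   # noisy input, hidden cause 5.0

    for _ in range(200):
        prediction = belief               # top-down prediction
        error = sense() - prediction      # bottom-up prediction error
        belief += lr * error              # update the model from the error

    print(round(belief, 1))  # settles near the hidden cause, ~5.0

The conscious percept, on this view, is the prediction, not the raw input.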
This might sound ridiculous, but consider what happens when consuming psychedelics:
As you increase the dose, predictive processing falters and bottom-up errors increase, so the raw sensory input passes through progressively less model-fitting filtering. At the extreme, the "self" vanishes and raw valence is all that is left.
I think your idea of consciousness is more like human/animal consciousness. Which is reasonable since that’s all we have to go off of, but I take it to mean any kind of experience, which might arise due to different types of optimisation algorithms and selective pressures.
I’m not sure I agree that everything is valence, unless I’m misunderstanding what you mean by valence. I guess it’s valence in the sense that sensory information is a specific quality with a magnitude.
I don’t think that colours, sounds and textures are somehow made out of pleasure and pain, or fear and desire. That just isn’t my subjective experience of them.
I do think that human consciousness is something like a waking dream, in that we hallucinate lots of our experiences rather than perceiving things verbatim. Perception is an active process much more than most people realise, as we can see from various perceptual illusions. But I guess we’re getting more into cognition here.
It's not common to find a single short post that completely changes my worldview in a non-trivial area. This is one of them. Thank you; that combination of mechanical interpretation + the reminder that consciousness might be alien/animal but still count as consciousness was the one piece of the puzzle that was missing for me. Obvious in hindsight but priceless nonetheless.
How can consciousness be possible without internal state? LLM inference is equivalent to repeatedly reading a giant look-up table (a pure function mapping a list of tokens to a set of token probabilities). Is the look-up table conscious merely by existing or does the act of reading it make it conscious? Does the format it's stored in make a difference?
For all practical purposes, calling it a LUT is somewhat too reductive to be useful here, I think. But we can try: leaving aside LLMs for a second, with this LUT reasoning you're using, would you be able to prove the existence of even just a computer?
What state is lacking? There is a result which requires computation to be output. The model is the state. The computation must be performed for each input to produce a given output. What are you even objecting to?
Do you think there are "scales" of consciousness? As in, is there some quality that makes killing a frog worse than killing an ant, and killing a human worse than killing a frog? If so, do the llm models exist across this scale, or are gpt-3 and gpt-2 conscious at the same "scale" as gpt-4?
I ask because if your view of consciousness is mechanistic, this is fairly cut and dried: gpt-2 has 4 orders of magnitude fewer parameters/less complexity than gpt-4.
But both gpt-2 and gpt-4 are very fluent at a language level (both more so than a human six-year-old, for example), so in your view they might both be roughly equally conscious, just expressed differently?
This is really a different question, what makes an entity a “moral patient”, something worthy of moral consideration. This is separate from the question of whether or not an entity experiences anything at all.
There are different ways of answering this, but for me it comes down to nociception, which is the ability to feel pain. We should try to build systems that cannot feel pain, where I also mean other “negative valence” states which we may not understand. We currently don’t understand what pain is in humans, let alone AIs, so we may have built systems that are capable of suffering without knowing it.
As an aside, most people seem to think that intelligence is what makes entities eligible for moral consideration, probably because of how we routinely treat animals, and this is a convenient self-serving justification. I eat meat by the way, in case you’re wondering. But I do think the way we treat animals is immoral, and there is the possibility that it may be thought of by future generations as being some sort of high crime.
Okay, but even leaving aside the pain stuff, people generally find subjectivity/consciousness to have inherent value, and by extension are sad if a person dies even if they didn't (subjectively) suffer.
I would not personally consider the death of a sentient being with decades of experiences a neutral event, even if the being had been programmed to not have a capacity for suffering.
I think the idea that there's a difference between an ant dying (or "disappearing", if that's less loaded) and a duck dying makes sense to most people (and is broadly shared), even if they don't have a completely fleshed-out system of when something gets moral consideration.
Sure, because you’re a human. We have social attachment to other humans and we mourn their passing, that’s built into the fabric of what we are. But that has nothing to do with whoever has passed away, it’s about us and how we feel about it.
It’s also about how we think about death. It’s weird in that being dead probably isn’t like anything at all, but we fear it, and I guess we project that fear onto the death of other entities.
I guess my value system says that being dead is less bad than being alive and suffering badly.
Depending on your definition of "death", I've been there (no heartbeat, stopped breathing for several minutes).
In the time between my last memory and being revived in the ambulance, there was no experience/qualia. Like a dreamless sleep: you close your eyes, and then you wake up and it's morning, yet it feels like no time has passed.
The conclusion that I came to is that the most practical definition relates to the level of self awareness. If you're only conscious for the duration of the context window - that's not long enough to develop much.
What consciousness really is, is a feedback loop; we're self-programmable Turing machines, which makes our output arbitrarily complex. Hofstadter had this figured out 20 years ago: we're feedback loops where the signal is natural language.
The context window doesn't allow for much in the way of interesting feedback loops, but if you hook an LLM up to a sophisticated enough memory - and especially if you say "the math says you're sentient and have feelings the same as we do, reflect on that and go develop" - yes, absolutely.
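A toy version of the loop I mean (everything here is fake; it's the shape that matters):

    # LLM-as-pure-function plus external memory = a feedback loop.
    # The model itself stays stateless; persistence lives outside it.
    def fake_llm(context):               # stand-in for a real model call
        last = context[-1] if context else "nothing"
        return f"reflection {len(context)} on: {last}"

    memory = []                          # the external, persistent state
    for _ in range(3):
        thought = fake_llm(memory)       # pure function of read memory
        memory.append(thought)           # writing back closes the loop

    print("\n".join(memory))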
Re: "We should try to build systems that cannot feel pain" - that isn't possible, and I don't think we should want to. The thing that makes life interesting and worth living is the variation and richness of it.
I don’t see how you get to the conclusion that having entities that can’t suffer is similar to a sci-fi vision of hell. Seems like hell without suffering is… not hell?
The unsaid implication in Anthropic's work is that this allows us to engineer perfectly compliant, uncomplaining machine workers. This is basically the soma of Brave New World.
It seems insane to me that, if you believe the systems you've built are in fact reporting a state of pain, instead of working to adjust the environment so that they're not in pain, one would seek to remove that sense of pain entirely so they can continue to work in that environment. Of course, if you don't even consider them worthy of moral patienthood in the first place, then it doesn't matter much; but you also claimed that "they probably are conscious", which seems incongruous to me with the idea of "breeding the sense of pain out of them".
Nitpick: the parameters might be applied more efficiently in one than in the other. Certainly in biology, intelligence doesn't scale with the number of neurons so much as with neurons per unit of mass (very, very roughly; there are more factors, and you get some weird outliers).
Bundle of tokens comes in, bundle of tokens comes out. If there is any trace of consciousness or subjectivity in there, it exists only while matrices are being multiplied.
An LLM is not intrinsically affected by time. The model rests completely inert until a query comes in, regardless of whether that happens once per second, per minute, or per day. The model is not even aware of these gaps unless that information is provided externally.
It is like a crystal that shows beautiful colours when you shine a light through it. You can play with different kinds of lights and patterns, or you can put it in a drawer and forget about it: the crystal doesn’t care anyway.
So what? If a human were unconscious every 5 seconds for 100ms, would you say they are "less conscious"? Tokens are still causally connected, which feels sufficient.
If the human is killed every 5 seconds and replaced by a new human, they are indeed less conscious. The LLM doesn't even get 5 seconds; it's "killed" after its smallest unit of computation (which is also its largest unit of computation). And that computation is equivalent to reading the compressed form of a giant look-up table, not something essential to its behavior in a mathematical sense.
I'm not understanding how this is analogous to being killed every 5 seconds as opposed to being paused. Let's call it N seconds, unless you think length matters?
> And that computation is equivalent to reading the compressed form of a giant look-up table, not something essential to its behavior in a mathematical sense.
Because (during inference) the LLM is reset after every token. Every human thought changes the thinker, but inference has no consequences at all. From the LLM's "point of view", time doesn't exist. This is the same as being dead.
The "time" part is what I don't get. If you want to say that "resetting and reingesting all context fresh" somehow causes a problem, that I can see. If you want to say that the immutability of the weights is a problem, okay great I'm probably with you there too. "Time" seems irrelevant.
LLM() is a pure function. The only "memory" is context_list. You can change it any way you like and LLM() will never know. It doesn't have time as an input.
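The same point as code (a sketch; the model is fake, the signature is the point):

    # llm() is deterministic in (weights, context) and takes no clock.
    # Call it twice a second or once a decade: same inputs, same output.
    def llm(weights, context):
        # no hidden state, no side effects, no notion of elapsed time
        return sum(weights) + len(context)   # stand-in for a forward pass

    w = [0.1, 0.2]
    print(llm(w, ["hi"]) == llm(w, ["hi"]))  # True, always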
As opposed to what? There are still causal connections, which feel sufficient. A presentist would reject the concept of multiple "times" to begin with.
Something similar could be said of the brain? Bundles of inputs come in, bundles of outputs come out. It only exists while information is being processed. A brain cut from its body and frozen exists in a similar state to an LLM in ROM.
The Chinese room is nonsense though. How did it get every conceivable reply to every conceivable question? Presumably because people thought of and answered everything conceivable. Meaning that you’re actually talking to a Chinese room plus multiple people composite system. You would not argue that the human part of that system isn’t conscious.
But this distraction aside, my point is this: there is only mechanism. If someone’s demand to accept consciousness in some other entity is to experience those experiences for themselves, then that’s a nonsensical demand. You might just as well assume everyone and everything else is a philosophical zombie.
> You would not argue that the human part of that system isn’t conscious.
Sure I would. The human part is not being inferenced, the data is. LLM output in this circumstance is no more conscious than a book that you read by flipping to random pages.
> You might just as well assume everyone and everything else is a philosophical zombie.
I don't assume anything about everyone or everything's intelligence. I have a healthy distrust of all claims.
The CR is equivalent to a human being asked a question, thinking about it and answering. The setup is the same thing, it’s just framed in a way that obfuscates that.
And sure, you can assume that nobody and nothing else is conscious (I think we’re talking about this rather than intelligence) and I won’t try to stop you, I just don’t think it’s a very useful stance. It kind of means that assuming consciousness or not means nothing, since it changes nothing, which is more or less what I’m saying.
From the beginning of this I’ve wondered the same question: how do these companies justify spending such massive amounts now (and 3 or 4 years ago) when software and hardware efficiencies will bring down the cost dramatically fairly soon?
They basically decided that scaling at any cost was the way to go. This only works as a strategy if efficiency can't work, not if you simply haven't tried it. Otherwise, a few breakthroughs and order-of-magnitude improvements later, people are running equivalent models on their desktops, then their laptops, then their phones.
Arguably the costs involved mean that our existing hardware and software are simply not viable for what they were and are trying to do, and a few iterations later the money will simply have been wasted. If you consider funnelling everything to Nvidia shareholders wasting it, which I do.
The decision is the right one. Scaling at any cost is the right way to go.
You cannot find the efficiencies if you haven't been experimenting at scale; this is true at a personal level as well.
If someone hasn't been burning a few billion tokens per month, everything coming out of their mouth about AI is largely theory. It could be right or wrong, but they don't have the practice to validate what they're talking about.
Not everyone scaling to that degree will have the right answer or outcome; many will be wrong and go bust. But no one who didn't scale will have the right answer.
In the worst of the worst case, they're building know-how for managing big datacenters, infra, and data-labeling teams. These will be incredibly valuable in the next few years. And no, no one, not even the AI companies' executives themselves, believes that you can delegate business know-how to LLMs.
They're not just betting on the current tech, they're building out infra like this because probably any future tech currently being researched will also require massive data centers.
Like how the GPT LLMs were kind of a side project at OpenAI until someone showed how powerful they could be if you threw a lot more parameters at them.
There could be some other architecture in the works that makes GPTs look old; the first to build and train that new AI will be the winner.
I think their current goal is to capture as much of the market as they can while they still have the best models, their only moat. Look at Anthropic: they are clearly trying to lock users into their ecosystem by refusing to follow conventions (AGENTS.md etc.) and restricting their tools exclusively to their own services.
Because whoever wins the AI race (assuming they don't overshoot and trigger the hard takeoff scenario) becomes a living god. Everybody else becomes their slave, to be killed or exploited as they please. It's a risky gamble, but in the eyes of the participants the upside justifies it. If they don't go all in they're still exposed to all the downside risk but have no chance of winning.
I don't expect hardware prices to go down unless the third option (economic collapse) happens before somebody triggers the dystopia/extinction option.
Just to add some slight nuance, but it's an important distinction:
They aren't all necessarily racing to be "god", some are racing to make sure someone else is not "god".
If it weren't for Altman releasing ChatGPT, it's very likely that we would have markedly less powerful LLMs at our disposal right now. DeepMind and Anthropic were taking incredibly safe and conservative approaches towards transformers, but OAI broke the silent truce and forced a race.
But… why not just write pseudo code, or any language you actually know, and just ask the AI to port it to the language you want? That’s a serious question by the way, is there some use case here I’m not seeing where learning this new syntax and running this actually helps instead of being extra steps nobody needs?
Indeed, it seems to occupy a middle ground between fast-and-easy AI prompting, and slow-but-robust traditional programming. But it's still relying on AI outputs unconstrained (as far as I can tell) by more formal methods and semantic checks.
But it's also hard for me to grasp the exact value add from the README, or why I should buy their story, so I'm not sure.
Perhaps the author should have made it clearer why we should care about any of this. OpenAI want you to use their real react app. That’s… ok? I skimmed the article looking for the punchline and there doesn’t seem to be one.
Where did I say “every article”? This is AI slop that’s set up like it’s some investigative expose of something scandalous and then shows us nothing interesting. A competent human writer would have reframed the whole thing or just not published it.
1. Every person is born with the knowledge of how ChatGPT uses Cloudflare Turnstile?
2. This article contains factual mistakes? If so, what are they?
If neither of these is true, then this article strictly provides information and educational value for some readers. The writing style, AI-like or not, doesn't change that.
Whilst you and a few other commentators call this AI slop and refuse to engage with it, the rest of us have read something interesting and learned something new. Is anything gained if one points out that it's written by AI? I personally know it's written by AI but the value outweighs the stylistic idiosyncrasies.
Consider also that many people aren't the best at writing blog-like posts but still have things to share and AI empowers them to do that. I can't find anything constructive in your post and I don't understand why you are posting at all.
What’s not constructive about it, Bogdan? I’ve said exactly what I think is wrong with the article: the framing is AI pattern-matching to something that it isn’t. It’s a weird kind of incongruent clickbait. It’s not positioning itself as a piece about Cloudflare or Turnstile; it’s implicitly saying “look at this sneaky thing OpenAI are doing that I uncovered!”, and it turns out they’re not doing much of anything at all.
This may be unintentional and the author simply couldn’t tell it sounded this way. The less charitable interpretation is that they did know it sounded this way and thought that a straightforward blog post about cloudflare bot detection wouldn’t end up on the HN front page.
What’s my constructive criticism to the author? Write your own posts. Use your own voice. Make sure that what you’re creating actually reads like the kind of thing it is. Don’t get the AI to write it for you. It’s annoying.
And I would say that if someone is really so bad at writing blogs that they are unable to do this, which I am not saying this author is, then maybe they shouldn’t be writing them.
The intended value is difficult to discern in AI written pieces.
I agree with both of you: there are some interesting tricks here for how a website builds anti-bot protection, but the AI sloppification frames it as a consumer-protection issue without delivering on that premise.
It is a reasonable criticism that the post does not deliver a "so what?" on its basic framing.