I don't like wading into this debate when semantics are very personal/subjective...

ggm · on Feb 7, 2025

"Expand the architecture" .. "get something much more powerful" .. "more dilithium crystals, captain"

Like I said elsewhere in this overall thread, we've been here before. Yes, you do see improvements in larger datasets, weighted models over more inputs. I suggest, I guess I believe (to be more honest) that no amount of "bigger" here will magically produce AGI simply because of the scale effect.

There is no theory behind "more" and that means there is no constructed sense of why, and the absence of abstract inductive reasoning continues to say to me, this stuff isn't making a qualitative leap into emergent anything.

It's just better at being an LLM. Even "show your working " is pointing to complex causal chains, not actual inductive reasoning as I see it.

gsam · on Feb 7, 2025

And that's actually a really honest answer. Whereas someone of the opposite opinion might be like parroting in the general copying-template sense actually generalizes to all observable behaviours because templating systems can be turing-complete or something like that. It's templates-all-the-way-down, including complex induction as long as there is a meta-template to match on its symptoms it can be chained on.

Induction is a hard problem, but humans can skip infinite compute time (I don't think we have any reason to believe humans have infinite compute) and still give valid answers. Because there's some (meta)-structure to be exploited.

Architecturally if machines / NN can exploit this same structure is a truer question.

visarga · on Feb 7, 2025

> this stuff isn't making a qualitative leap into emergent anything.

The magical missing ingredient here is search. AlphaZero used search to surpass humans, and the whole Alpha family from DeepMind is surprisingly strong, but narrowly targeted. The AlphaProof model uses LLMs and LEAN to solve hard math problems. The same problem solving CoT data is being used by current reasoning models and they have much better results. The missing piece was search.

UniverseHacker · on Feb 7, 2025

I'm sure both of you know this, but "stochastic parrot" refers to the title of a research article that contained a particular argument about LLM limitations that had very little to do with parrots.

danielmarkbruce · on Feb 7, 2025

The term is much more broadly known than the content of that (rather silly) paper.... I'm not even certain that it's the first use of the term.

ggm · on Feb 7, 2025

https://books.google.com/ngrams/graph?content=Stochastic%2C+...

ggm · on Feb 7, 2025

And the word "hallucination" ... has very little to do with...

HappMacDonald · on Feb 8, 2025

But it's far easier for human parrots to parrot the soundbyte "stochastic parrot" as a thought-terminating cliche.