Hacker News | exegeist's comments

Impressive prediction, especially pre-ChatGPT. Compare to Gary Marcus 3 months ago: https://garymarcus.substack.com/p/reports-of-llms-mastering-...

We may certainly hope Eliezer's other predictions don't prove so well-calibrated.


Gary Marcus is so systematically and overconfidently wrong that I wonder why we keep talking about this clown.


People pay attention to those making surprising, bold, counter-narrative predictions but stop paying attention when those predictions turn out wrong.


People like him and Zitron do serve a useful purpose in balancing the hype from the other side, which, while justified to a great extent, is often a bit too overwhelming.


Being wrong in the other direction doesn't mean you've found a great balance; it just means you've found a new way to be wrong.


These numbers feel kind of meaningless without any work showing how he got to 16%.


I do think Gary Marcus says a lot of wrong stuff about LLMs but I don’t see anything too egregious in that post. He’s just describing the results they got a few months ago.


He definitely cannot use the original arguments from when ChatGPT arrived; he's a perennial goalpost shifter.


My understanding is that Eliezer more or less thinks it's over for humans.



Technical strengths aside, I’ve been impressed with how non-robotic Kimi K2 is. Its personality is closer to Anthropic’s best: pleasant, sharp, and eloquent. A small victory over botslop prose.


I have a different experience in chatting/creative writing. It tends to overuse certain speech patterns without repeating them verbatim, and is strikingly close to the original R1 writing, without being "chaotic" like R1 - unexpected and overly dramatic sci-fi and horror story turns, "somewhere, X happens" at the end etc.

Interestingly enough, EQ-Bench/Creative Writing Bench doesn't spot this despite clearly having it in their samples. This makes me trust it even less.

