Hacker News | jazzpush2's comments

That wasn't even remotely rigorous?

FWIW ZeRO-3 refers to a common strategy for sharding model state across GPUs (PyTorch's counterpart is FSDP, Fully Sharded Data Parallel). The "3" is the stage of sharding, i.e. how much stuff gets distributed across GPUs: stage 1 shards just the optimizer state, stage 2 adds gradients, and stage 3 adds the weights as well.
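
For a rough sense of what each sharding level buys you, here's a minimal sketch of the per-GPU memory arithmetic as I understand it from the ZeRO paper, assuming mixed-precision Adam (2 bytes per parameter for fp16 weights, 2 for fp16 gradients, 12 for fp32 optimizer state). The function name and the 7.5B-parameter / 64-GPU numbers are illustrative assumptions, and activations, buffers, and fragmentation are ignored.

```python
def zero_bytes_per_gpu(num_params, num_gpus, stage, k=12):
    """Approximate model-state bytes per GPU for ZeRO stage 0-3.

    Assumes mixed-precision Adam; k is the optimizer-state bytes per
    parameter (fp32 master weights + momentum + variance = 12).
    """
    params = 2 * num_params  # fp16 weights
    grads = 2 * num_params   # fp16 gradients
    optim = k * num_params   # fp32 optimizer state

    if stage >= 1:           # stage 1: shard optimizer state
        optim /= num_gpus
    if stage >= 2:           # stage 2: also shard gradients
        grads /= num_gpus
    if stage >= 3:           # stage 3: also shard the weights
        params /= num_gpus
    return params + grads + optim


if __name__ == "__main__":
    # Hypothetical setup: 7.5B parameters spread over 64 GPUs.
    for stage in range(4):
        gb = zero_bytes_per_gpu(7_500_000_000, 64, stage) / 1e9
        print(f"stage {stage}: {gb:.1f} GB of model state per GPU")
```

Stage 0 here is plain data parallelism (everything replicated), so the gap between stage 0 and stage 3 is roughly a factor of the GPU count.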

A codon-based model is cool. I know NVIDIA is building quite a large one.

At GTC they showed an SAE they built on a smaller version of it, allowing you to see what their model learned: https://research.nvidia.com/labs/dbr/blog/sae/


It's important to note that pathological liars don't stop lying. In fact, when they're caught lying red-handed, they usually double down and lie even more.

I also assume these damage-control-type missives are very misleading. I've seen so many of these on HN over the years.

Also: you'd expect a compliance company to understand basic software licensing, especially the most popular licenses.

The CEO is a clear scammer. How anyone trusts another word out of her mouth is beyond belief.

Pretty disgusting behavior from the founders, just posting as normal on LinkedIn/Twitter as if this is run-of-the-mill. Fraudsters need to be nipped in the bud, lest we get Trump-like scenarios.

That Reddit thread is brutal. Knowingly making interviewees pay hundreds of dollars to interview in this economy is messed up.

Finally, my years of playing StarCraft have real-world use! Also: Everyone will soon bow to S. Korea :D

Just gotta worry about the network lag

Right? At least they're easy to identify (and subsequently close).


> At least they're easy to identify

For better or worse, I found this one different.

Usually I see a solid wall of black, but this one was actually readable with scripts disabled.

