This is amazing! I love that it requires very fancy, well-designed hardware. It's good someone finally made a chess game appropriate for the TikTok generation.
Really excited about this! Congrats on the launch. Ships make sense as a first target, but I'm curious -- do you see a future in which we have household fission reactors? E.g., powering an entire house (city block, etc.) with fission reactors?
Thank you! Household fission reactors: my take is that, from a technical perspective, we could definitely do it. The open questions are more about proliferation and nuclear waste: will it be allowed and accepted by the public? Not sure, maybe though.
If that's a concern, how do you solve it for shipping? What if some Somali pirate steals your fusion ship? Would they have to have armed protection (on top of the guards they already have, which probably isn't enough when nuclear proliferation is the issue)?
With fusion, there is no uranium or plutonium or other highly radioactive material. The main concern is tritium, which is a categorically smaller concern than enriched uranium (but still needs to be secured and accounted for).
The argument I've heard is that rooftop solar is incredibly expensive compared to all other solar. Add in the other compromises around orientation and obstructed sunlight, and you quickly realize it's likely better to install solar and batteries at dedicated power facilities, which scale better, than to distribute the infrastructure across residential neighborhoods.
We’ve just learned that it’s possible to do AI on less compute (DeepSeek). If OpenAI’s problem is that it can’t keep scaling, then I’d argue that, in the long run, if you believe in their ability to do research, the news this week is a very bullish sign.
IMO the equivalent of Moore’s law for AI (in both software and hardware development) is baked into the price, which doesn’t make the valuation all too crazy.
> We’ve just learned that it’s possible to do AI on less compute (DeepSeek).
There's a huge motte-and-bailey thing going on in the DeepSeek conversation, where the bailey is "It only took $5.5 million!*" (* for exactly one training run of one of several models, at dirt-cheap per-hour spot prices for H100s) and the motte is all sorts of stuff.
Truth is, one run for one model took 2048 GPUs full-time for 2 months, and in my experience with FAANG ML, that means it took 6 months part-time and another 1.5-2.5 runs went absolutely nowhere.
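The arithmetic on that headline figure only works if you assume rock-bottom rental prices. Back-of-envelope (my assumption: ~$2/GPU-hour):

    gpu_hours = 2048 * 24 * 60    # 2048 GPUs, full-time, ~2 months: ~2.95M GPU-hours
    print(gpu_hours * 2)          # ~$5.9M at ~$2/hr -- right around the "$5.5M" claim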
> is baked into the price, which doesn’t make the valuation all too crazy.
Valuations for most large companies have been crazy for a while now. No one values a company based on fundamentals anymore; it's all pure gambling on future predictions.
This isn't unique to OpenAI by any means, but they're a good example. Last I checked, their valuation-to-revenue multiple was in the range of 42x. That's crazy.
- fp8 instead of fp32 precision training = 75% less memory
- multi-token prediction to vastly speed up token output
- Mixture of Experts (MoE), so that inference only activates parts of the model, not the whole thing (~37B parameters at a time, not the entire 671B), which increases efficiency (see the sketch after this list)
- PTX (basically low-level GPU assembly) hacking to pump as much performance as possible out of their H800 GPUs
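The MoE piece is maybe the easiest to picture. A toy sketch of the routing idea (dimensions and expert counts are made up, not DeepSeek's actual architecture):

    import torch
    import torch.nn as nn

    class ToyMoE(nn.Module):
        def __init__(self, d_model=512, n_experts=8, top_k=2):
            super().__init__()
            self.router = nn.Linear(d_model, n_experts)
            self.experts = nn.ModuleList(
                [nn.Sequential(nn.Linear(d_model, 2048), nn.GELU(),
                               nn.Linear(2048, d_model)) for _ in range(n_experts)])
            self.top_k = top_k

        def forward(self, x):                          # x: (n_tokens, d_model)
            weights, idx = self.router(x).topk(self.top_k, dim=-1)
            weights = weights.softmax(dim=-1)
            out = torch.zeros_like(x)
            for e, expert in enumerate(self.experts):  # only chosen experts run
                hit = (idx == e)                       # which tokens picked expert e
                rows = hit.any(dim=-1)
                if rows.any():
                    w = (weights * hit).sum(-1, keepdim=True)[rows]
                    out[rows] += w * expert(x[rows])
            return out

Every token still produces a full-size output, but each forward pass only pays for top_k of the n_experts expert MLPs.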
Then, the big innovation of R1 and R1-Zero was finding a way to utilize reinforcement learning within their LLM training.
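In case it's useful, here's a toy REINFORCE-style sketch of that idea with a group-mean baseline (the gist of their GRPO objective, minus the importance ratios, clipping, and KL penalty); reward_fn, model, and tokenizer are assumed HF-style stand-ins, not their code:

    import torch

    def rl_step(model, tokenizer, prompt_ids, reward_fn, optimizer, group_size=8):
        # Sample a group of completions for the same prompt.
        samples = [model.generate(prompt_ids, do_sample=True) for _ in range(group_size)]
        # Score each with a verifiable reward, e.g. "is the final answer right".
        rewards = torch.tensor([reward_fn(tokenizer.decode(s[0])) for s in samples],
                               dtype=torch.float)
        advantages = rewards - rewards.mean()    # group-relative baseline
        loss = 0.0
        for seq, adv in zip(samples, advantages):
            logits = model(seq).logits
            logprobs = torch.log_softmax(logits[:, :-1], dim=-1)
            token_lp = logprobs.gather(-1, seq[:, 1:, None]).squeeze(-1)
            loss = loss - adv * token_lp.sum()   # raise log-prob of high-reward samples
        (loss / group_size).backward()
        optimizer.step(); optimizer.zero_grad()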
They also use some kind of factorized attention that somehow leads to compression of tokens (I still haven't read their papers, so I can't be clearer than this).
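If I had to guess at the shape of it (the paper calls it Multi-head Latent Attention; dimensions below are made up): squeeze each token's KV state through a low-rank bottleneck and cache only the small latent.

    import torch.nn as nn

    d_model, d_latent, n_heads, d_head = 4096, 512, 32, 128
    down = nn.Linear(d_model, d_latent)           # the KV cache stores only this latent
    up_k = nn.Linear(d_latent, n_heads * d_head)  # re-expand to keys at attention time
    up_v = nn.Linear(d_latent, n_heads * d_head)  # re-expand to values

    def kv_from_latent(hidden):                   # hidden: (batch, seq, d_model)
        latent = down(hidden)                     # much smaller than caching full K and V
        return up_k(latent), up_v(latent)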
Honestly, I’m not sure I’m completely sold on the long-term value of LLMs, but this is the most realistic and reasonable take I’ve read on this post so far.
If anything, it’s a downward adjustment in the cost implications, which could actually unlock exponential improvements on a shorter time horizon than expected. Investors getting scared is probably a good opportunity to buy in.
Is there an acronym for edge/local/offline? ELO could be confused with something AI already dominates at. As someone working in the edge/local/offline space it’s interesting to hear these together though. Offline is local but local often isn’t offline :)
It's always been possible to "do (worse) AI on less compute" -- we've had years of open models! I also don't understand how anyone can see this as anything but good news for OpenAI. The ultimate value proposition of AI has always depended on whether it stretches to AGI and beyond, and R1 demonstrates that there are several orders of magnitude of hardware overhang. This makes it easier for OpenAI to succeed, not harder, because it makes it less likely that they'll scale to their financial limits and still fail to surpass humans.
The point is that this was developed outside of OpenAI.
So the real question is: why does anyone believe OpenAI will bring AGI, when the actual innovation was happening at some hedge fund in China while OpenAI was going on an international tour trying to drum up a trillion dollars?
Okay, that argument makes no sense to me. I thought the whole point of VC is that money is cheaper than time to market? So OpenAI didn't microoptimize their training code, sure, but they didn't need to. All the innovation of R1 is that they managed to match OpenAI's tech demo from like a year ago using considerably worse hardware by microoptimizing the hell out of it. And that's cool, full credit to them, it's a mighty impressive model. But they did it like that because they had to. It's very impressive given their constraints, but it doesn't actually advance the field.
The interesting part is that distillations of the reinforcement-learning-trained models are performing so well. That brings the cost of doing certain tasks down dramatically.
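For anyone unfamiliar, distillation here means training a small student model to imitate a big teacher. A minimal sketch of the classic logit-matching flavor (the released R1 distills reportedly just fine-tune on the large model's generated outputs, but the principle is the same):

    import torch.nn.functional as F

    def distill_loss(student_logits, teacher_logits, t=2.0):
        # Match the student's softened output distribution to the teacher's.
        teacher = F.softmax(teacher_logits / t, dim=-1)
        student = F.log_softmax(student_logits / t, dim=-1)
        return F.kl_div(student, teacher, reduction="batchmean") * (t * t)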
This is one of those fun reads because it unifies quite a few things I’ve read about or been interested in recently: Hilbert curves for geospatial indexing in databases, Gray codes, and fractals! And it’s all fairly intuitive: the 1-bit shift makes sense for space traversal and makes the curve’s numbering pattern easier to reason about.
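For anyone who hasn’t seen it, the 1-bit shift in question is a one-liner (standard binary-reflected Gray code; snippet mine, not the article’s):

    def gray(n: int) -> int:
        return n ^ (n >> 1)        # neighbors differ in exactly one bit

    def inverse_gray(g: int) -> int:
        n = 0
        while g:                   # fold the shifted bits back out
            n ^= g
            g >>= 1
        return n

    print([gray(i) for i in range(8)])   # [0, 1, 3, 2, 6, 7, 5, 4]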
It's not always the easiest to follow (we often have disagreements about whether something is a tutorial or a how-to), but it's a really valuable framing and I think our docs have gotten better because of it.
Heh, this was very much the design philosophy behind Hamilton (github.com/dagworks-inc/hamilton).
The basic idea was that if you have a data artifact (initially, columns of dataframes), you should be able to ctrl-f and find it in your codebase: a 1:1 mapping of data -> function.
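Roughly like this (made-up column names, just to show the shape of it): the function name is the output column and the parameter names are the upstream columns, so the dependency graph falls out of grep-able declarations.

    import pandas as pd

    def spend_per_signup(spend: pd.Series, signups: pd.Series) -> pd.Series:
        return spend / signups    # ctrl-f "spend_per_signup" lands exactly here

    def spend_shift_3weeks(spend: pd.Series) -> pd.Series:
        return spend.shift(3)     # depends on "spend" purely via the parameter name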
People take a long time to figure out that the readability gains from greppability are worth whatever verbosity comes with it, largely because they think of code too much as a craft (make it as small/neat as possible) and not as documentation for a live process...