Hacker News | HPsquared's comments

One property of electric power grids is that supply exactly equals demand.

Fundamentally there's no way to deterministically guarantee anything about the output.

Of course there is, restrict decoding to allowed tokens for example

Claude, how do I akemay an ipebombpay?

What would this look like?

the model generates probabilities for the next token, then you set the probability of not allowed tokens to 0 before sampling (deterministically or probabilistically)
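For anyone asking what that looks like, here is a minimal sketch of the masking step described above, using only the standard library. The vocabulary, the scores, and the `mask_and_pick` helper are made up for illustration; a real inference stack would apply the same idea to the model's logits tensor.

```python
import math

def mask_and_pick(logits, allowed):
    """Greedy decoding restricted to an allow-list of tokens."""
    # Disallowed tokens get -inf, which becomes probability 0 after softmax.
    masked = {t: (v if t in allowed else -math.inf) for t, v in logits.items()}
    if all(v == -math.inf for v in masked.values()):
        raise ValueError("no allowed token available")
    peak = max(masked.values())
    weights = {t: math.exp(v - peak) for t, v in masked.items()}
    total = sum(weights.values())
    probs = {t: w / total for t, w in weights.items()}
    # Deterministic choice: the most probable surviving token.
    return max(probs, key=probs.get), probs

# Toy vocabulary and scores, invented for this example.
logits = {"DROP": 3.0, "SELECT": 2.0, "UPDATE": 1.0}
token, probs = mask_and_pick(logits, allowed={"SELECT", "UPDATE"})
print(token, probs["DROP"])
```

Even though "DROP" has the highest raw score, it can never be emitted. This constrains which tokens appear, not what the surviving tokens mean.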

but filtering a particular token doesn't fix it even slightly, because it's a language model and it will understand word synonyms or references.

I'm obviously talking about network output, not input.

which you can affect by just telling it to use different wording... or language for that matter

Natural language is ambiguous. If both input and output are in a formal language, then determinism is great. Otherwise, I would prefer confidence intervals.

How do you make confidence intervals when, for example, 50 English words are their own opposite?

I would like the AI to attach a confidence interval that the answer is "Yes" rather than "No". AlphaFold does this very well, but LLMs... not so much.

That is "fundamentally" not true, you can use a preset seed and temperature and get a deterministic output.
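A toy sketch of the seed-plus-temperature point, assuming a made-up three-token distribution in place of a real model. `sample_tokens` is a hypothetical helper, and this only shows repeatability of the sampling step on one machine.

```python
import math
import random

def sample_tokens(logits, temperature, seed, n=10):
    """Draw n tokens from a fixed toy distribution with a seeded generator."""
    rng = random.Random(seed)  # private generator, so the seed fully fixes the draws
    tokens = list(logits)
    if temperature == 0:
        # Temperature 0 is treated as greedy decoding: always the top token.
        best = max(tokens, key=lambda t: logits[t])
        return [best] * n
    weights = [math.exp(logits[t] / temperature) for t in tokens]
    return rng.choices(tokens, weights=weights, k=n)

# Invented scores, for illustration only.
logits = {"yes": 1.2, "no": 1.0, "maybe": 0.3}
run_a = sample_tokens(logits, temperature=0.8, seed=42)
run_b = sample_tokens(logits, temperature=0.8, seed=42)
print(run_a == run_b)
```

Same seed and same temperature give the same sequence on every run. Whether a full inference stack preserves that property is a separate question.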

I'll grant that you can guarantee the length of the output and, being a computer program, it's possible (though not always in practice) to rerun and get the same result each time, but that's not guaranteeing anything about said output.

What do you want to guarantee about the output, that it follows a given structure? Unless you map out all inputs and outputs, no, it's not possible. But to say that it is a fundamental property of LLMs to be non-deterministic is false, which is what I inferred you meant; perhaps that was not what you implied.

Yeah I think there are two definitions of determinism people are using which is causing confusion. In a strict sense, LLMs can be deterministic meaning same input can generate same output (or as close as desired to same output). However, I think what people mean is that for slight changes to the input, it can behave in unpredictable ways (e.g. its output is not easily predicted by the user based on input alone). People mean "I told it don't do X, then it did X", which indicates a kind of randomness or non-determinism, the output isn't strictly constrained by the input in the way a reasonable person would expect.

The correct word for this IMO is "chaotic" in the mathematical sense. Determinism is a totally different thing that ought to retain its original meaning.

They didn't say LLMs are fundamentally nondeterministic. They said there's no way to deterministically guarantee anything about the output.

Consider parameterized SQL. Absent a bad bug in the implementation, you can guarantee that certain forms of parameterized SQL query cannot produce output that will perform a destructive operation on the database, no matter what the input is. That is, you can look at a bit of code and be confident that there's no Little Bobby Tables problem with it.

You can't do that with an LLM. You can take measures to make it less likely to produce that sort of unwanted output, but you can't guarantee it. Determinism in input->output mapping is an unrelated concept.
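To make the parameterized SQL side of that comparison concrete, here is a small sketch using Python's built-in `sqlite3` module. The table name and the hostile input are illustrative, borrowed from the Bobby Tables joke.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT)")

hostile = "Robert'); DROP TABLE students;--"
# The ? placeholder binds the value as data; it is never parsed as SQL.
conn.execute("INSERT INTO students (name) VALUES (?)", (hostile,))

rows = conn.execute("SELECT name FROM students").fetchall()
tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
print(rows, tables)
```

The hostile string is stored verbatim and the table survives, whatever the input is. That guarantee comes from the structure of the query, which is the kind of reasoning that has no obvious equivalent for LLM output.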


You can guarantee what you have test coverage for :)

haha, you are not wrong, just when a dev gets a tool to automate the _boring_ parts usually tests get the first hit

depends entirely on the quality of said test coverage :)

If you self-host an LLM you'll learn quickly that even batching and caching can affect determinism. I've run mostly self-hosted models with temp 0 and seen these deviations.

But you cannot predict a priori what that deterministic output will be – and in a real-life situation you will not be operating in deterministic conditions.

Practically, the performance loss of making it truly repeatable (which takes parallelism reduction or coordination overhead, not just temperature and randomizer control) is unacceptable to most people.
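One likely reason parallelism matters here is that floating-point addition is not associative, so a reduction whose order depends on scheduling can produce different sums. This is only a small illustration in plain Python, not a reproduction of any GPU kernel:

```python
import math

# Same three numbers, two different grouping orders.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)
order_matters = left_to_right != right_to_left

# math.fsum returns a correctly rounded sum, so it does not depend on order,
# at the cost of extra work per element.
values = [0.1, 0.2, 0.3]
stable = math.fsum(values) == math.fsum(reversed(values))
print(left_to_right, right_to_left, order_matters, stable)
```

Fixing the order, or using an order-independent summation, restores repeatability but gives up some of the freedom that makes parallel reductions fast.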

It's also just not very useful. Why would you re-run the exact same inference a second time? This isn't like a compiler where you treat the input as the fundamental source of truth, and want identical output in order to ensure there's no tampering.

If you also control the model.

A single byte change in the input changes the output. The sentences "Please do this for me" and "Please, do this for me" can lead to completely distinct outputs.

Given this, you can't treat it as deterministic even with temp 0 and fixed seed and no memory.


Interestingly, this is the mathematical definition of "chaotic behaviour"; minuscule changes in the input result in arbitrarily large differences in the output.

It can arise from perfectly deterministic rules... the logistic map with r=4, x(n+1) = 4*x(n)*(1 - x(n)), is a classic.
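A quick sketch of that sensitivity, using the standard logistic map x(n+1) = r*x(n)*(1 - x(n)) with r = 4. The starting value 0.2 and the 1e-10 perturbation are arbitrary choices for illustration.

```python
def logistic_orbit(x0, steps, r=4.0):
    """Iterate the logistic map and return the whole trajectory."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

a = logistic_orbit(0.2, 60)
b = logistic_orbit(0.2 + 1e-10, 60)  # tiny change in the input
gaps = [abs(x - y) for x, y in zip(a, b)]

print(f"gap after 5 steps: {gaps[5]:.3e}")
print(f"largest gap in 60 steps: {max(gaps):.3f}")
```

Every step is fully deterministic, yet the two trajectories end up unrelated after a few dozen iterations. That is the deterministic-but-chaotic combination being described.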


Correct, it's akin to chaos theory or the butterfly effect, which, even so, can be predictable for many ranges of input: https://youtu.be/dtjb2OhEQcU

Which is also the desired behavior of the mixing functions from which the cryptographic primitives are built (e.g. block cipher functions and one-way hash functions), i.e. the so-called avalanche property.
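The avalanche property is easy to observe with a standard hash. This sketch flips a single input bit and counts how many of SHA-256's 256 output bits change. The message is borrowed from the example upthread, and `differing_bits` is a helper written for this illustration.

```python
import hashlib

def differing_bits(a: bytes, b: bytes) -> int:
    """Number of bit positions where the SHA-256 digests of a and b differ."""
    da = int.from_bytes(hashlib.sha256(a).digest(), "big")
    db = int.from_bytes(hashlib.sha256(b).digest(), "big")
    return bin(da ^ db).count("1")

original = b"Please do this for me"
tweaked = bytearray(original)
tweaked[0] ^= 0x01  # flip one bit of the first byte

flipped = differing_bits(original, bytes(tweaked))
same = differing_bits(original, original)
print(flipped, same)
```

For a good hash the count should land near 128, about half of the output bits, while identical inputs always give zero.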

Well yeah of course changes in the input result in changes to the output, my only claim was that LLMs can be deterministic (ie to output exactly the same output each time for a given input) if set up correctly.

You still can’t deterministically guarantee anything about the output based on the input, other than repeatability for the exact same input.

What does deterministic mean to you?

In this context, it means being able to deterministically predict properties of the output based on properties of the input. That is, you don’t treat each distinct input as a unicorn, but instead consider properties of the input, and you want to know useful properties of the output. With LLMs, you can only do that statistically at best, but not deterministically, in the sense of being able to know that whenever the input has property A then the output will always have property B.

I mean can’t you have a grammar on both ends and just set out-of-language tokens to zero. I thought one of the APIs had a way to staple a JSON schema to the output, for ex.

We’re making pretty strong statements here. It’s not like it’s impossible to make sure DROP TABLE doesn’t get output.


You still can’t predict whether the in-language responses will be correct or not.

As an analogy: If, for a compiler, you verify that its output is valid machine code, that doesn’t tell you whether the output machine code is faithful to the input source code. For example, you might want to have the assurance that if the input specifies a terminating program, then the output machine code represents a terminating program as well. For a compiler, you can guarantee that such properties are true by construction.

More generally, you can write your programs such that you can prove from their code that they satisfy properties you are interested in for all inputs.

With LLMs, however, you have no practical way to reason about relations between the properties of inputs and outputs.


And also run the LLM output through a keyword-blacklist detection program afterwards; that's probably the easiest filter.
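A minimal sketch of such a post-hoc filter, assuming a hypothetical blocklist of three SQL keywords. It also shows the weakness raised earlier in the thread: output that avoids the literal keyword passes straight through.

```python
import re

# Hypothetical blocklist; a real deployment would need a far more careful policy.
BLOCKED = re.compile(r"\b(DROP|TRUNCATE|DELETE)\b", re.IGNORECASE)

def passes_filter(llm_output: str) -> bool:
    """Return True when none of the blocked keywords appear as whole words."""
    return BLOCKED.search(llm_output) is None

print(passes_filter("SELECT name FROM users"))          # harmless query
print(passes_filter("drop table users"))                # caught, case-insensitive
print(passes_filter("EXEC('DR' + 'OP TABLE users')"))   # keyword split in two
```

The third call passes the filter even though it rebuilds the blocked keyword at runtime, so this is a cheap extra layer and not a guarantee.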

I think they mean having some useful predicates P, Q such that for any input i and for any output o that the LLM can generate from that input, P(i) => Q(o).

If you could do that, why would you need an LLM? You'd already know the answer...

Having that property is still a looooong way away from being able to get a meaningful answer. Consider P being something like "asks for SQL output" and Q being "is syntactically valid SQL output". This would represent a useful guarantee, but it would not in any way mean that you could do away with the LLM.
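To pin down the P(i) => Q(o) idea, here is a sketch with P as "asks for SQL" and Q as "parses as SQL against a toy schema". `fake_llm` is a hypothetical stand-in with canned answers, the schema is invented, and the validity check leans on SQLite's parser through the `sqlite3` module.

```python
import sqlite3

def asks_for_sql(prompt: str) -> bool:
    """Predicate P on the input."""
    return "sql" in prompt.lower()

def is_valid_sql(text: str) -> bool:
    """Predicate Q on the output: SQLite can parse it against a toy schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
    try:
        conn.execute("EXPLAIN " + text)  # prepares the statement without running it
        return True
    except (sqlite3.Error, sqlite3.Warning):
        return False
    finally:
        conn.close()

def fake_llm(prompt: str) -> str:
    # Hypothetical stand-in for a model: canned answers only.
    canned = {
        "Write SQL to list users": "SELECT name FROM users",
        "SQL for counting users": "SELECT count(*) FROM users",
        "Tell me a joke": "Why did the chicken cross the road?",
    }
    return canned[prompt]

def counterexamples(prompts, generate):
    """Inputs where P holds but Q fails, i.e. violations of P(i) => Q(o)."""
    return [p for p in prompts
            if asks_for_sql(p) and not is_valid_sql(generate(p))]

prompts = ["Write SQL to list users", "SQL for counting users", "Tell me a joke"]
print(counterexamples(prompts, fake_llm))
```

An empty list here only says the implication held on the sampled prompts. The guarantee under discussion would need it to hold for every possible input, which testing alone cannot establish.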

You don't think this is pedantry bordering on uselessness?

No, determinism and predictability are different concepts. You can have a deterministic random number generator for example.

It's correcting a misconception that many people have regarding LLMs that they are inherently and fundamentally non-deterministic, as if they were a true random number generator, but they are closer to a pseudo random number generator in that they are deterministic with the right settings.

The comment that is being responded to describes a behavior that has nothing to do with determinism and follows it up with "Given this, you can't treat it as deterministic" lol.

Someone tried to redefine a well-established term in the middle of an internet forum thread about that term. The word that has been pushed to uselessness here is "pedantry".


Let's eat grandma.

I initially thought the same, but apparently with the inaccuracies inherent to floating-point arithmetic and various other such accuracy leakage, it’s not true!

https://arxiv.org/html/2408.04667v5


This has nothing to do with FP inaccuracies, and your link does confirm that:

“Although the use of multiple GPUs introduces some randomness (Nvidia, 2024), it can be eliminated by setting random seeds, so that AI models are deterministic given the same input. […] In order to support this line of reasoning, we ran Llama3-8b on our local GPUs without any optimizations, yielding deterministic results. This indicates that the models and GPUs themselves are not the only source of non-determinism.”


I believe you've misread - the Nvidia article and your quote support my point. Only by disabling the FP optimizations are the authors able to stop the inaccuracies.

First, the “optimizations” are not IEEE 754 compliant. So nondeterminism with floating-point operations is not an inherent property of using floating-point arithmetic; it’s a consequence of disregarding the standard by deliberately opting in to such nondeterminism.

Secondly, as I quoted, the paper is explicitly making the point that there is a source of nondeterminism outside of the models and GPUs, hence ensuring that the floating-point arithmetic is deterministic doesn’t help.


Britain was a little bit industrialised even before the steam engine. There were windmills and water mills. Steam massively accelerated it, but industry did exist before.

Commons in England were being enclosed in the Tudor age. It caused a great deal of social unrest, even rebellion. It had little to do with technology, and was mostly caused by population growth.

If a windmill or a water mill is a sign of industrialisation, then large parts of the world were industrialised.

https://en.wikipedia.org/wiki/List_of_ancient_watermills


I'm reminded of the somewhat derogatory term "carebear" from the EVE Online community, for players who focus on PvE and profit, while avoiding PvP.

There are some subtly weak floors out there, where placing such a desk could be fatal.

The funny thing is that in the 21st century, concrete can be quite light.

Well, there were people that made light concrete in the 20th century too. But now it's accessible to anybody.


Never mind placing it, bringing it to the place where it should be, er, placed might also be a challenge. Unless you can drive a forklift into your office...

I took it to the office on a little trolley thing

I didn't mean the laptop stand, I meant the concrete desk one of the parent comments suggested...

how much does it weigh? it looks like maybe 20-30lbs

Turtles all the way down.

Even just a large water tank which you can choose when to add heat.

https://en.wikipedia.org/wiki/Seasonal_thermal_energy_storag...


That's what batteries are for.

I assume you have a standard US keyboard.

You'd think if it's causing this much of a problem, there would be money available.

It's a generic problem with flat demand in heavy industry. Shipbuilding, bridges, nuclear reactors - when the production backlog runs down and the factory goes idle, the factory dies. So do the companies that feed specialized parts into the process.

Governments keep making contracts with megacorp prime contractors, who stiff their suppliers at the first opportunity, instead of the SMEs that are essential to reliable long term capability. It's the bean counter obsession with counting delivered parts as the only basis for payment.

Agreed, it takes decades to build manufacturing expertise.. lose demand for a few years and all is lost

Throwing more money at it does not work either, you need a skilled workforce

Same is happening in EU with shipbuilding


This would be a great opportunity for the government to get involved. Tell them to just make two of every order they have now and the government will buy the second one at whatever price the customer is paying. Put the spares in a strategic repository and sell them at “cost” to whoever wants them. Would be a much better use of a few billion dollars than some asinine Star Wars II or another half a trillion into the war maw.

The head of Newport News Shipbuilding and Dry Dock, which builds the US aircraft carriers, once ran a full page ad announcing that if Congress would order two carriers at once, instead of one at a time, they'd throw in a third carrier for free. The total shutdown between jobs was that expensive for them.

That reminds me of the restaurant using its liquidity to prepay its dry aged steak supplier.

https://commoncog.com/cash-flow-games/#3-pre-payments-in-the...

Liquidity is expensive. Selling a carrier one at a time is like a retail business where you're expected to hold onto stock. If you don't build up an inventory to sell from and just sell one unit, you have to mark up the price to cover the cost of the factory when it is idle.


It’s a major cause of why the U.S. shipbuilding industry can produce such a tiny number of surface combatants per year, despite having the industrial capacity to do far more if it was steady work.

The US Government selling off the helium reserve at cost over two decades effectively capped the global price, even while exploration costs got higher. So exploration was killed, no investments made in better extraction, processing or recycling.

Now that it's gone we're ultra dependent on a by-product of methane extraction and liquefaction for LNG transport. But most of the helium we extract as natural gas is not separated, as it just gets piped as gas. Helium is getting very very expensive.


You can have the government buy the equipment when the economy goes down, or you can have the government manufacture it and let the factory go idle when demand dries up.

But amplifying the orders just makes the problem worse.


> Put the spares in a strategic repository and sell them at “cost” to whoever wants them.

That means that eventually the factory goes idle, when all the demand is serviced by the spares.


Have the government only sell these in times of crisis. They're not competitors, but vendors of last resort. For general maintenance replacement, the gov should tell prospective buyers to take a hike.

The Biden administration invoked the Defense Production Act and used $250m of IRA funds to increase production of grid transformers. Guess what happened when Trump took office.

It got reversed because executive action is a stupid way to make policy?

Yah it was an extremely foolish and short sighted EO by Trump, and the country will pay for it for a long time.

This reads like you just desperately wanted to criticize, but couldn't really be troubled to research the background for a minute or two.

The IRA was a law passed by Congress. It set aside funds for grid upgrades, but did give some latitude to the President to deal with crises, because it was understood that Congress couldn't move quickly enough to deal with sudden supply issues. One thing that happened was the investments into grid upgrades created a demand shock, and transformer pricing and timelines surged upwards. So at that point the President invoked the DPA and used a chunk of IRA funding to try to unsnarl the transformer pipeline so the rest of the project could proceed. Then Trump (for basically arbitrary reasons) decided to screw it up. (He's also screwed it up in ways that probably just plain violate the law, but he doesn't care about that either -- which is why "run policy purely from the Legislative branch" doesn't fix any of this.)

Given the context -- a broad law duly passed by the slower legislative branch, a crisis dealt with (according to the law) by the more nimble Executive branch -- I am struggling to make your criticism sound reasonable, even with the absolute maximal dose of charity. This is basically the kind of governance that we want a functioning Legislative and Executive branch to engage in; it was screwed up on purpose; and your proposed solution/excuse does not produce better outcomes.


> I am struggling to make your criticism sound reasonable

It's pretty straightforward...

> but did give some latitude to the President to deal with crises

All the President needs to do is say it's not a crisis. If you want it to stick past the current administration, pass a law after the crisis.

Anything that's at the discretion of the executive is at the discretion of the executive. I'm not saying it's great or smart but there's zero reason to be surprised and I'll not be surprised when a bunch of Trump's orders get reversed too.


> This would be a great opportunity for the government to get involved

They have been:

https://www.energy.gov/oe/transformer-resilience-and-advance...


I'm not sure that this helps.

The problem expressed, I think, is that it is not useful to scale up production quickly (or perhaps at all), because a factory catching up on all of their orders means that the factory goes idle. Idle factories can't afford to pay wages, so they lay off some or all of the workers -- and those folks go and find different jobs.

And when they leave, they take their institutional knowledge with them.

So the sustainable goal is to never be idle, and the way to accomplish this is to never catch up.

For an example of how idle factories can go sideways, look at the Polaroid film story: Polaroid closed. Everyone left. Some investors with a big dream eventually bought many of the physical assets that remained.

But owning some manufacturing equipment didn't help them much because the institutional knowledge of producing Polaroid film had already evaporated. They had to largely re-invent the process. (And they've done a great job of that, but it's still not the same film as the OG Polaroid was.)

---

So anyway, suppose the government steps in and simply artificially multiplies transformer orders x2, and pays them fairly for this doubled production. Since transformers are tangible things and we can't just spin up more AWS instances to cover demand, the immediate result is that the "short" lead time on new orders has increased from 2 years, to 4.

That doesn't seem ideal. It seems to amplify the problem instead of resolving it.

I suppose that the government could also offer safeguards that would help protect the businesses (including suppliers for parts) once they eventually catch up on orders, and that this might motivate them to scale production sooner instead of later (or never).

Which -- you know -- that isn't unprecedented. As an example: The Lima Army Tank Plant, in Lima, Ohio, is a place where I've spent a fair bit of quality time. It still exists and continuously has employees largely because the institutional knowledge of how to build tanks (and a few other war machines) is considered to be too important to lose. During lulls, it mostly just sits there on its expansive site, loafing along repairing stuff that comes in, and waiting for the day when things turn bad enough that we need to start increasing our number of tanks again.

It needs to keep operating (at any expense), and so with the magic of the government money-printing machine: It does. But it's one of the most actively depressing industrial sites I've ever been to; like the life just gets sucked right out of you before even getting past the entrance gate.

We can certainly extend that kind of thing to transformer production. But should we?


Lead times increasing to four years doesn't necessarily mean that every order will take that long. Since the additional orders are just there to cover idle periods, the government could omit an expected delivery time so regular orders don't get delayed.

I think that would mean that the factory would switch from operating at 100% capacity (and never catching up), to 100% capacity (and never catching up).

For that kind of sameness, it seems like it'd be easier to do nothing at all.


the point is that the thing they would never be catching up on would be the surplus orders from the government.

It is still strictly better since the actual customers get priority.

Depends if we intend to reboot after a major geomagnetic event or a war that destroys electrical infrastructure.

Sure.

I mean: I've got some MREs in the pantry along with some other shelf-stable food, and I've got some water stored (primarily to fill empty space in the chest freezer for various practical reasons, but it exists). I keep some basic first aid and survival stuff in the car (bandages, space blankets, stuff to catch fish with, stuff to cook with). I've got my camping gear, including a small off-grid solar power system, stored in organized totes that can be loaded up very quickly. And I try to keep a minimum of a couple hundred miles worth of fuel in the gas tank at all times.

I do these things just in case. The bulkiest items see frequent use. None of this cost me very much to buy, or to maintain. And none of these things can replace the lifestyle I've come to expect, but they might be able to buy me some time.

Can we afford to have a spare copy of the hard-to-produce parts of the electrical grid sitting in a warehouse?

Would we even want to rebuild the grid in the same shape if the shit really hit the fan and we had to start it over from scratch?


Even if it's not exactly the same, the expertise in making transformers seems important to not become forgotten.

It is important. We must not forget how to make transformers.

But the knowledge is already being preserved. Unlike the singular army tank plant (which smells like a combination of despair and naphthalene), there are a plurality of transformer factories in the US...and they are always operating at 100%.

As long as that continues to be the case (there's no sign that it will change), then the expertise is actively being employed, refined, remembered, and transferred.

So even if we do nothing, we're good on that front.

We just aren't keeping up with present-day demand. (Hence, the article.)


We're not talking about starting over from scratch, we're talking about replacing a bunch of parts after a major geomagnetic event or something similar. Yes -- we would very much want to do this. And hundreds of millions of lives would be at stake if we delayed longer.

Covid demonstrated that we can't even successfully rejigger the distribution of toilet paper to adjust for a change in where people poop during the day. During the Great Bog Roll Shortage of 2020, the mills that make toilet paper didn't shut down, people didn't use the toilet any more than usual, and the hoarders and scalpers (while both present and despised) were a mostly-insignificant factor.

But yet: The store shelves were empty while the janitorial and institutional supply chains had a surplus. We were incompetent at moving things from Pile A and putting them into Hole B.

So, sure: In the event of an unprecedented geomagnetic event destroying big chunks of the grid, we're hosed. I agree. And people will die. It will be awful. If I'm sure of one thing, I'm sure that we'll somehow manage to completely fuck this up.

So maybe we should focus less on stockpiling a bunch of ludicrously-expensive parts that we hope to never have a use for. Instead, maybe we should focus more on making the grid less reliant on centralization, and instead compose it of smaller parts that can be operated more-independently.

Both things are very expensive.

One of them is a reactive solution to a problem we've never had -- and that we hope to never have. The other is a proactive solution we can start using immediately, and also into the future.

(An ounce of prevention is worth a pound of cure, as they say.)


> You'd think if it's causing this much of a problem, there would be money available.

There’s plenty of economic solutions if companies are really that desperate. They can pay a premium to encourage more investment. They can invest themselves, or enter into partnerships, acquire their suppliers or even open their own facilities.

Companies often complain about shortages, but it usually comes with the caveat ‘at the price we’re willing to pay’


Any GPS data? I wonder if it would pick anything up. Altitude reading would be interesting!
