Hacker News | nathanrf's comments

GADTs let you do similar things in the simplest cases, they just don't work as well for complex constraints.

Like, append: Vector n t -> Vector m t -> Vector (n+m) t. With GADTs you need to manually lift (+) to work on the types, and then provide proofs that it's compatible with the value-level.

Then you often need singletons for the cases that are more complex- if I have a set of integers, and an int x, and I want to constrain that the item is in the set and is also even, the GADT approach looks like:

doThing :: Singleton Int x -> Singleton (Set Int) s -> IsInSet x s -> IsEven x -> SomeOutput

Which looks reasonable, until you remember that the Singleton type has to have a representation which is convenient to prove things about, not a representation that is convenient/efficient to operate on, which means your proofs will be annoying or your function itself will hardly resemble a "naive" untyped implementation.

Dependent types fix this because your values are "just" values, so it would be something like

doThing :: (x : int) -> (s : set int) -> so (is_in s x) -> so (even x) -> SomeOutput

Here, x and s are just plain values, not weird wrappers, and is_in and even are ordinary (and arbitrary) library functions.
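For a concrete flavor, here is a hedged sketch in Lean 4 (the names, the List representation of the set, and the String output are my own illustration, not from the original): plain values plus ordinary proof arguments, with no singleton wrappers.

```lean
-- Sketch only: x and s are ordinary values; the membership and
-- evenness constraints are passed as plain proof arguments.
def doThing (x : Int) (s : List Int)
    (hmem : x ∈ s) (heven : x % 2 = 0) : String :=
  toString x ++ " is an even member of the set"
```

Callers discharge hmem and heven with ordinary proofs (or `by decide` for concrete values), and the function body never has to pattern-match on a proof-friendly encoding of x or s.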


In the style of the linked post, you'd probably define a generic type (well, one of two generic types):

type ExactlyStatic : (0 t: Type) -> (0 v: t) -> Type

type ExactlyRuntime : (0 t: Type) -> (v: t) -> Type

Then you could have the type (ExactlyStatic Int 9) or the type (ExactlyRuntime Int 9).

The difference is that ExactlyStatic Int 9 doesn't expose the value 9 to the runtime, so it would be fully erased, while (ExactlyRuntime Int 9) does.

This means that the runtime representation of the first would be (), and the second would be Int.

> Also how to handle runtime values? That will require type assertions, just like union types?

The compiler doesn't insert any runtime checks that you aren't writing in your code yourself. The difference is that now when you write e.g. `if condition(x) then ABC else DEF`, inside each of the two branches you can obtain a proof that condition(x) is true/false, and propagate this.

Value representations will typically be algebraic-data-type flavored (so, often a tagged union) but you can use erasure to get more efficient representations if needed.


The two commands affect the same account balance, so they don't commute, so these commands conflict. Every EPaxos node is required to be able to determine whether any two commands conflict; in this case it would be something like:

  def do_commands_conflict(c1, c2): return len(write(c1) & read(c2)) > 0 or len(write(c2) & read(c1)) > 0 or len(write(c1) & write(c2)) > 0

Whenever an EPaxos node learns about a new command, it compares it to the commands that it already knows about. If it conflicts with any current commands, then it gains a dependency on them (see Figure 3, "received PreAccept"). So the commands race; the first node to learn about both of them is going to determine the dependency order [in some cases, two nodes will disagree on the order that the conflicting commands were received -- this is what the "Slow Path" is for].
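A minimal sketch of that dependency-gathering step (hypothetical representation: commands here carry explicit read/write key sets and an id, which is not how the paper encodes them):

```python
def conflicts(c1, c2):
    # Two commands conflict if either one writes a key the other touches.
    return bool(c1["write"] & (c2["read"] | c2["write"])
                or c2["write"] & c1["read"])

def on_learn(new_cmd, known):
    # When a node learns a new command, it records a dependency on
    # every already-known conflicting command, then remembers it.
    deps = [c["id"] for c in known if conflicts(new_cmd, c)]
    known.append(new_cmd)
    return deps

known = []
t1 = {"id": "t1", "read": {"acct:A"}, "write": {"acct:A"}}
t2 = {"id": "t2", "read": {"acct:A"}, "write": {"acct:A"}}
t3 = {"id": "t3", "read": {"acct:B"}, "write": {"acct:B"}}

print(on_learn(t1, known))  # [] -- nothing known yet
print(on_learn(t2, known))  # ['t1'] -- same balance, so t2 depends on t1
print(on_learn(t3, known))  # [] -- disjoint keys commute with both
```

The real protocol then has to reconcile these per-node dependency lists (the PreAccept/Slow Path machinery); this sketch only shows the local conflict test.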

The clients don't coordinate this; the EPaxos nodes choose the order. The cluster as a whole guarantees linearizability. This just means that there's at least one possible ordering of client requests that would produce the observed behavior; if two clients send requests concurrently, there's no guarantee of who goes first.

(in particular, the committed dependency graph is durable, even though it's arbitrary, so in the event of a failure/restart, all of the nodes will agree on the dependency graph, which means that they'll always apply non-commuting commands in the same order)


This is the (pre-print) paper produced from that project - i.e. it is a self-contained, complete description of the work completed so that it can be reviewed, and then cited.

The underlying result has not changed, but the presentation of the details is now complete.


Here's a high level overview for a programmer audience (I'm listed as an author but my contributions were fairly minor):

[See specifics of the pipeline in Table 3 of the linked paper]

* There are roughly 181 million essentially different Turing machines with 5 states; first, these were enumerated exhaustively

* Then, each machine was run for about 100 million steps. Of the 181 million, about 25% of them halt within this many steps, including the Champion, which ran for 47,176,870 steps before halting.

* This leaves 140 million machines which run for a long time.

So the question is: do those TMs run FOREVER, or have we just not run them long enough?

The goal of the BB challenge project was to answer this question. There is no universal algorithm that works on all TMs, so instead a series of (semi-)deciders were built. Each decider takes a TM, and (based on some proven heuristic) classifies it as either "definitely runs forever" or "maybe halts".

Four deciders ended up being used:

* Loops: run the TM for a while, and if it re-enters a previously-seen configuration, it definitely has to loop forever. Around 90% of machines either do this or halt, so this covers most of them.

6.01 million TMs remain.
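A toy version of that first decider (my own sketch, nothing like the real implementation's machine encoding or scale):

```python
def runs_forever_by_cycling(tm, max_steps=10_000):
    # Semi-decider sketch: if a machine ever revisits an exact
    # configuration (state, head position, tape contents), it is
    # stuck in a cycle and definitely runs forever.
    # Returns True ("loops forever"), False ("halted"), None ("gave up").
    state, head, tape = "A", 0, {}     # tape: position -> nonzero symbol
    seen = set()
    for _ in range(max_steps):
        key = (state, head, frozenset(tape.items()))
        if key in seen:
            return True
        seen.add(key)
        sym = tape.get(head, 0)
        if (state, sym) not in tm:     # undefined transition = halt
            return False
        wr, mv, state = tm[(state, sym)]
        if wr:
            tape[head] = wr
        else:
            tape.pop(head, None)       # writing 0 = blank cell
        head += 1 if mv == "R" else -1
    return None                        # inconclusive

# A machine that ping-pongs between two cells forever on a blank tape:
pingpong = {("A", 0): (0, "R", "B"), ("B", 0): (0, "L", "A")}
print(runs_forever_by_cycling(pingpong))                   # True

# A machine that writes one 1 and hits an undefined transition:
print(runs_forever_by_cycling({("A", 0): (1, "R", "B")}))  # False
```

Note this only catches in-place cycles; a machine whose pattern drifts down the tape never repeats an exact configuration, which is why the later deciders are needed.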

* NGram CPS: abstractly simulates each TM, tracking a set of binary "n-grams" that are allowed to appear on each side. Computes an over-approximation of reachable states. If none of those abstract states enter the halting transition, then the original machine cannot halt either.

Covers 6.005 million TMs. Around 7000 TMs remain.

* RepWL: Attempts to derive counting rules that describe TM configurations. The NGram model can't "count", so this catches many machines whose patterns depend on parity. Covers 6557 TMs.

There are only a few hundred TMs left.

* FAR: Attempts to describe each TM's state as a regex/FSM.

* WFAR: like FAR but adds weighted edges, which allows some non-regular languages (like matching parentheses) to be described

* Sporadics: around 13 machines had complicated behavior that none of the previous deciders worked for. So hand-written proofs (later translated into Rocq) were written for these machines.

All of the deciders were eventually written in Rocq, which means that they are coupled with a formally-verified proof that they actually work as intended ("if this function returns True, then the corresponding mathematical TM really never halts").

Hence, all 5-state TMs have been formally classified as halting or non-halting. The longest-running halter is therefore the champion - it was already suspected to be the champion, but this proves that there wasn't any longer-running 5-state TM.


In Coq-BB5, machines are not run for 100M steps but are thrown directly into the pipeline of deciders. Most halting machines are detected by Loops using low parameters (max 4,100 steps), and only 183 machines are simulated up to 47M steps to deduce halting.

In legacy bbchallenge's seed database, machines were run for exactly 47,176,870 steps, and we were left with about 80M non-halting candidates. The discrepancy comes from Coq-BB5 not excluding 0RB and other minor factors.

Also, regarding "There are 181 million ish essentially different Turing Machines with 5 states": it is important to note that we end up with this number only after the proof is completed; knowing this number is as hard as solving BB(5).


Why is knowing that there are 181 M 5-state machines difficult? Is it not just (read*write*move*(state+halt))^state? About 48^5, modulo "essentially different".


The number of 5-state TMs is 21^10 = 16,679,880,978,201, coming from (1 + write*move*state)^(2*state); the difference with your formula is that in our model, halting is encoded using an undefined transition.
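The count is easy to sanity-check:

```python
states, write_symbols, moves = 5, 2, 2
# Each of the 2*states transition-table entries is either undefined
# (encoding halting) or one of write*move*state defined transitions:
per_entry = 1 + write_symbols * moves * states   # 21
total = per_entry ** (2 * states)                # 21**10
print(total)  # 16679880978201
```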

"essentially different" is not a statically-checked property. It is discovered by the enumeration algorithm (Tree Normal Form, see Section 3 of https://arxiv.org/pdf/2509.12337); in particular, for each machine, the algorithm needs to know if it will ever reach an undefined transition, because if it does, it will visit the children of that machine (i.e. all ways to set that reached undefined transition).

Knowing if an undefined transition will ever be reached is not computable, hence knowing the number of enumerated machines is not computable in general and is as hard as solving BB(5).


I guess the meaning of "essentially different" essentially depends on the strength of the mathematical theory that you used to classify them!

When I first heard it I thought about using some kind of similar symmetry arguments (e.g. swapping left-move and right-move). Maybe there are also more elaborate symmetry arguments of some kind.

Isn't it fair to say that there is no single objective definition of what differences between machines are "essential" here? If you have a stronger theory and stronger tools, you can draw more distinctions between TMs; with a weaker theory and weaker tools, you can draw fewer distinctions.

By analogy, suppose you were looking at groups. As you get a more sophisticated group theory you can understand more reasons or ways that two groups are "the same". I guess there is a natural limit of group isomorphism, but perhaps there are still other things where group structure or behavior is "the same" in some interesting or important ways even between non-isomorphic groups?


Turing machine program states are conventionally represented with letters: A, B, C, etc. The starting state is A.

Now suppose you are running a Turing machine program from the beginning. The only state it has visited so far is state A. It runs until it reaches a state that has not been visited yet. What state is it? B? C? D? According to "tree normal form", the name for that next state just is the earliest unused state name, which in this case is B.

Visited states so far are A and B. Run until an unvisited state is reached again. What is its name? C? D? E? Tree normal form says that the state will be called C. And the next newly visited state will be D, etc. In general, the canonical form for a Turing machine program will be the one that puts initial state visits in alphabetical order. (This concept also applies to the multi-color case.)
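The renaming idea in isolation can be sketched like this (my illustration, not the actual enumeration code): states are relabeled A, B, C, ... in order of first visit.

```python
def tnf_rename(visit_order):
    # Canonical naming: each state gets the earliest unused letter
    # the first time it is visited (the TNF convention described above).
    names = iter("ABCDE")
    mapping = {}
    out = []
    for s in visit_order:
        if s not in mapping:
            mapping[s] = next(names)
        out.append(mapping[s])
    return out

# A run that first visits A, then D, then C gets relabeled A, B, C:
print(tnf_rename(["A", "D", "A", "C", "D"]))  # ['A', 'B', 'A', 'C', 'B']
```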

It's not possible to tell of an arbitrary TM program whether or not that program is in tree normal form. But this proof doesn't look at arbitrary programs. It generates candidate programs by tracing out the normal tree starting from the root, thereby bypassing non-normal programs altogether.

That is what "essentially different" means here.


If you count arbitrary transition tables, then you get a count of 63,403,380,965,376 [1].

If you count transition tables in which states are reachable and canonically ordered, then you get a count of 632700 * 4^10 = 663,434,035,200 [2]. These machines can behave differently on arbitrary tape contents.

TNF further reduces these counts by examining machine behaviour when starting on an empty tape.

[1] https://oeis.org/A052200

[2] https://oeis.org/A107668
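Both counts check out arithmetically (my reading of the first: with an explicit halt state, each of the 10 table entries picks a write symbol (2), a move (2), and a next state out of 6):

```python
# Arbitrary 5-state, 2-symbol transition tables with an explicit halt state:
arbitrary = (2 * 2 * 6) ** 10
print(arbitrary)  # 63403380965376

# The reachable, canonically-ordered count quoted above:
reachable = 632700 * 4 ** 10
print(reachable)  # 663434035200
```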


Just to directly address this - the key thing wrong in this formula is that the exponent is 2*state, not state. The state transition table operates on both the current state and the symbol read from the tape, so for a binary Turing machine with 5 states, you have a 10-entry transition table.


Ah, yes, of course the read bit should move into the exponent, since it's an input, not an output of the function. But the key point I was making is that there exists a formula. (I don't really care what the formula is.) The part I was not understanding was the complexity of "essentially different".


In this context, two syntactically-different TMs are considered "essentially the same" if all reachable states are the same up to reordering their labels (except for the fixed starting label A) and globally swapping the L/R directions.

The problem is knowing how many states are actually reachable, and how many are dead code. This is impossible to decide in general thanks to Rice's theorem and whatnot. In this case, it involves deciding all 4-state machines.


My intuition is that this would include every 4 state (and fewer) machine that has a fifth state that is never used. For every different configuration of that fifth state I would consider the machine to be essentially the same.


Thank you for the write-up, it is very interesting!

Is there some reason why Rocq is the proof assistant of choice and not one of the others?

also.... https://www.youtube.com/watch?v=c3sOuEv0E2I


It is as simple as: the person who contributed the proofs/implementations chose Rocq.

I did some small proofs in Dafny for a few of the simpler deciders (most of which didn't end up being used).

At the time, I found Dafny's "program extraction" (i.e. transpilation to a "real" programming language) very cumbersome, and produced extremely inefficient code. I think since then, a much improved Rust target has been implemented.


Its focus around constructible objects does lend itself well to the kind of proofs needed for non-halting. Usually they involve lemmas about the 'syntax' of a TM's configuration and how it evolves over time, and the actual number-based math is relatively simple. (With the exception of Skelet #17, which involves fiddly invariants of finite sequences.)


This is a fantastic summary that's very easy to understand. Thank you very much for writing it!


Thank you!


So how many sporadics would be left if we run the same on 6-state TM?


Very many - I think the number is several thousand. Several sporadic 6-state machines have been solved, though, and there are currently about 2400(?) unsolved machines. Among these are several Cryptids, machines whose halting problem is known to be mathematically hard.


You guys need a blog to chase these down.

If nothing else, it’ll inspire the next generation of mathematicians.


Thanks for writing that up! Was there anything to be learned from those 13 sporadics or are they simply truly sporadic?


Thank you!

Yeah, one lesson is that if you relax your halting condition to "entering a loop at some point", you get much longer runtimes: see machine Skelet #1 (https://bbchallenge.org/1LC1LE_---1LD_1RD0LD_1LA1RE_0LB0RC), which enters a loop after about 10^51 steps, and the loop itself is 10^9 steps long.


> There is no universal algorithm that works on all TMs

Is this suspected or proven? I would love to read the proof if it exists.


It is indeed proven and the reason they're called Turing machines! https://en.wikipedia.org/wiki/Halting_problem


Doesn't the discovery of the fifth Busy Beaver value indicate that there is a decider for 5-state Turing machines?


Yes, there are deciders for all finite sets of TMs. You just cannot have one for all TMs.


I think actually for relatively small n we get cases where mathematics says nope, you can't decide that, the machine goes recursive and so now your decider may be looking at a machine which is itself running deciders and Kurt Gödel says "No".


Thanks for the hint to go looking some more. I found that Johannes Riebel has proven that BB(745) is undecidable (independent of ZFC). So even for fairly small k there may not be deciders.


The suspicion is that this happens maybe as early as BB(15). We just can't prove that whereas we can prove BB(745) is not decidable, and, of course, we've decided BB(5) as we see here.


Yes. But there is no decider for n-state Turing machines that works regardless of n.


Scooping the Loop Snooper (an elementary proof of the undecidability of the halting problem)

No program can say what another will do. Now, I won't just assert that, I'll prove it to you: I will prove that although you might work til you drop, you can't predict whether a program will stop.

Imagine we have a procedure called P that will snoop in the source code of programs to see there aren't infinite loops that go round and around; and P prints the word "Fine!" if no looping is found.

You feed in your code, and the input it needs, and then P takes them both and it studies and reads and computes whether things will all end as they should (as opposed to going loopy the way that they could).

Well, the truth is that P cannot possibly be, because if you wrote it and gave it to me, I could use it to set up a logical bind that would shatter your reason and scramble your mind.

Here's the trick I would use - and it's simple to do. I'd define a procedure - we'll name the thing Q - that would take any program and call P (of course!) to tell if it looped, by reading the source;

And if so, Q would simply print "Loop!" and then stop; but if no, Q would go right back to the top, and start off again, looping endlessly back, til the universe dies and is frozen and black.

And this program called Q wouldn't stay on the shelf; I would run it, and (fiendishly) feed it itself. What behaviour results when I do this with Q? When it reads its own source, just what will it do?

If P warns of loops, Q will print "Loop!" and quit; yet P is supposed to speak truly of it. So if Q's going to quit, then P should say, "Fine!" - which will make Q go back to its very first line!

No matter what P would have done, Q will scoop it: Q uses P's output to make P look stupid. If P gets things right then it lies in its tooth; and if it speaks falsely, it's telling the truth!

I've created a paradox, neat as can be - and simply by using your putative P. When you assumed P you stepped into a snare; Your assumptions have led you right into my lair.

So, how to escape from this logical mess? I don't have to tell you; I'm sure you can guess. By reductio, there cannot possibly be a procedure that acts like the mythical P.

You can never discover mechanical means for predicting the acts of computing machines. It's something that cannot be done. So we users must find our own bugs; our computers are losers!

by Geoffrey K. Pullum, Stevenson College, University of California, Santa Cruz, CA 95064

From Mathematics Magazine, October 2000, pp 319-320.


In principle, analysts could analyze all machines of size K or less, classifying many of them as halting or not, and all the rest could in fact be non-halting, but the analyst would never (at any finite time) know for sure whether they are all non-halting.


This is proven. It's known as the halting problem and is a central pillar of computability theory. The proof was invented by Alan Turing and is online.


This is just the Halting Problem, many good introductory expositions of the proof exist.


How much computing time does it take to evaluate a machine to 100 million steps on a modern EPYC system?


Thanks. Amazing work.


I'm not involved in this rewrite, but I made some minor contributions a few years ago.

TSC doesn't use many union types, it's mostly OOP-ish down-casting or chains of if-statements.

One reason for this is, I think, performance: most objects are tagged by bitsets in order to pack more info about the object without needing additional allocations. But TypeScript can't really (ergonomically) represent this in the type system, so you don't get any really useful unions.

A lot of the objects are also secretly mutable (for caching/performance) which can make precise union types not very useful, since they can be easily invalidated by those mutations.


The quoted paragraph is correct as written. E.g. if

f: ∀A, B. A -> B -> A

Then we have poly type π = "∀A, B. A -> B -> A".

The monotype μ = Str -> Int -> Str is a specialization of π, so we are also permitted to judge that f has type μ.

The type specialization operator "looks backwards": we write something like "π ≤ μ" because there's a kind of "subtyping" relationship (though most HM type systems don't actually contain subtyping) - anywhere that a term of type μ is used, a term of type π can be used instead, because it can be specialized to μ.
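A toy illustration of specialization (hypothetical representation, purely mine: a polytype is a pair of quantified variable names and a body, and arrow types are tuples):

```python
def specialize(polytype, subst):
    # Substitute concrete types for the quantified variables,
    # turning the polytype pi into one of its monotypes mu.
    qvars, body = polytype
    assert set(subst) <= set(qvars)  # only quantified vars may be replaced
    def walk(t):
        if isinstance(t, tuple):     # ("->", arg, result)
            return (t[0], walk(t[1]), walk(t[2]))
        return subst.get(t, t)
    return walk(body)

# pi = forall A, B. A -> B -> A
poly = (("A", "B"), ("->", "A", ("->", "B", "A")))
# mu = Str -> Int -> Str
mono = specialize(poly, {"A": "Str", "B": "Int"})
print(mono)  # ('->', 'Str', ('->', 'Int', 'Str'))
```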

I think this is a good complementary article: https://boxbase.org/entries/2018/mar/5/hindley-milner/ (it also links to three more academic sources for more context/detail)


The paper defines them as programs in a process calculus (which is fairly standard as far as theory for protocols is involved):

  Definition 1 (Asserted protocols) Asserted protocols, or just protocols for short, are
  ranged over by S and are defined by the following syntax rules:
  S ::= p.S                  action prefix
      | +{ l_i : S_i }_i∈I   branching
      | µt.S                 fixed-point
      | t                    recursive variable
      | end                  end
      | assert(n).S          assert (produce)
      | require(n).S         require
      | consume(n).S         consume

Process calculi are "fundamental" descriptions of computation analogous to lambda calculus but oriented around communication instead of function calls. (As far as paper structure goes, I find that the important "basic" definitions in programming language research papers are usually in Section 2, since Section 1 serves as a high-level overview.)

Basically, a protocol consists of a sequence of sends/receives on a particular channel, mixed with some explicit logic and loops/branches until you reach the end. There are some examples in Section 2.1 which are too complicated to reproduce here.

As a general note on reading protocols- for (good, but industry-programmer-unfriendly) technical reasons they're defined and written as "action1.action2.action3.rest_of_program" but mentally you can just rewrite this into

  {
    action1();
    action2();
    action3();
    ... rest_of_program ...
  }
(in particular, making "the rest of the program" part of each statement makes specifying scope much easier and clearer, which is why they don't just use semicolons in the first place)


Thanks for your guidance! I now see they're (now-obviously!) things like TCP and HTTP. I had missed their informal definition:

> Here we use the term protocol to denote a specification of the interaction patterns between different system components. [...] To give a more concrete intuition, an informal specification of a protocol for an e-banking system may be as follows: The banking server repeatedly offers a menu with three options: (1) request a banking statement, which is sent back by the server, (2) request a payment, after which the client will send payment data, or (3) terminate the session.

I would say it's like how you use an API.


Feral != native. There are no honeybee species native to North America; all honeybees, including the feral ones, are descended from colonies imported from Europe (e.g. the "Western honeybee", Apis mellifera).

There are a lot of non-honeybee bee species native to North America, though, and they now face competition from feral domesticated honeybees. It's unclear exactly how much impact they have - some research does treat honeybees as a harmful invasive species (similar to many other human-introduced species).

Native bee species have a harder time getting good PR because they don't directly work for us, even though they are important pollinators for some native plants.


I think there at least used to be honey producing bee species in the southern US. I believe some are extinct now and others are only present in South America now.


Mayan honeybees are native to the Yucatan, FWIW. They've also spread to Cuba.


South America has the stingless variety and I’m very jealous.


It is a good thing that cppfront lets you do that, then!

Cppfront generates #line directives, which tell the compiler which source lines to "blame" for each piece of generated code in the .cpp file. This isn't something new and fancy for cppfront; it's a bog-standard directive that your toolchain already understands. So it will work the exact same as your current debugging workflow even if you mix cpp and cpp2 source files.


I'm working on a hobby project language that generates plain C as output, and debugger integration has been one of my big worries. If that works, then this is awesome, thank you!

