fcholf's comments

fcholf · 2026-01-15T22:36:52 1768516612

I agree, though, you can still work toward understanding using an LLM (and take it from a skeptical) by, e.g., using them as challengers to your ideas.

That said, I think it requires a lot of self discipline and should be complemented with other methods and sources of information to be useful. As a teacher, I really try to prevent my undergraduate students from taking the easy road of using LLMs to solve every easyish problem I give them to *learn*. Sure, they did the homework but most of them did not learn anything while doing it and they finish their first year without having learnt anything actionable regarding computer science (observe that I use a different approach with students from other areas, though I still think it is good to spend a few days without relying on LLMs).

I often use a sport analogy to land my point which works with them, so let me share it here. If you want to learn how to run a marathon and drive 42km every day, then you are certainly (hopefully ?) a better driver but nowhere near to running a marathon (fortunately, no one has yet challenged me with the fact that running a marathon is way less useful to get a job than driving).

(edit: grammar, spelling)

fcholf · 2025-10-28T14:51:17 1761663077

To efficiently encode XOR-constraint you will necessarily need extra variables (as you proposed) because we know that no bounded depth and polynomial size Boolean circuit can encode parity ("parity is not in AC0").

The problem with this kind of constraints is that they are not local and really disturb CDCL based SAT solvers because you cannot remove the constraint unless you have fixed all its variables. Now, cryptominisat has a specific focus on this kind of constraints and implements ad-hoc routines to exploit this routine. cryptominisat is hence focused toward applications where this kind of constraint naturally appears, be it circuit synthesis or cryptography.

Now, CDCL solver is another beast. Gaussian elimination / Grobner basis are not implemented in these solvers simply because it is "too costly". One way of understanding why is just to reframe what one call SAT solver. This is not really a tool to solve SAT (hell, they can't even deal with the pigeon hole principle without preprocessing), the purpose of a SAT solver is to quickly solve instances coming from natural encoding of constraint systems. In this framework, unit propagation and clause learning are basically extremely efficient because they uncover hidden structure in the way, us human, encode constraint problems. Hence (CDCL) SAT solvers are highly efficient in many applications. But it is easy to kill one with a few variables by simply going out of this framework. For example, they are killers in solving packages dependencies because the structure of these problems matches this approach, CDCL solvers are then just an extremely refined way of bruteforcing your way through the search space.

Now, if your application heavily contains XOR-constraints, then it is likely a (CDCL) SAT solver is not the right approach. For circuit synthesis, some people use BDD based approach (where parity can be efficiently encoded) but sometimes you will simply have to develop your dedicated solver or use other solvers for other NP-complete problem that will have a different area of specialization (as already observed in many comments).

js8 · 2025-10-28T16:10:49 1761667849

I appreciate the answer.

fcholf · 2025-10-28T14:36:37 1761662197

I would also add that #SAT solvers, aiming at counting the number of solutions, are often implicitly solving the ALL-SAT in a more efficient manner than what you would have with modified CDCL solvers because they use other caching and decomposability techniques. Check knowledge compiler d4 for example https://github.com/crillab/d4v2 that can build some kind of circuits representing every solution in a factorized yet tractable way.

fcholf · 2025-10-27T08:54:19 1761555259

Hi Alixander

Thanks for the good work!

I am late to the party, but let me add my grain of salt. I think animations could be a killer feature when preparing for talks. This is something I already do with a modified version of graphviz (I explain my workflow below) and d2 seems to have almost all functionalities I need to adapt it but one (the one I had to add to graphviz): hardcoding custom HTML attributes in the generated SVG for animation purposes.

Let me explain: I have been using revealjs for making slides for a while now. The way "animation" is dealt with here is really interesting: html blocks having the "fragment" class will be visited with a class "current-fragment" and "fragment-visited" one after the other. One can then play with the animation using CSS, eg, .fragment is not displayed but .fragment.current-fragment and .fragment.fragment-visited are displayed. Hence, moving through slides allows to display new content in the slides. Nothing prevents it to work on inlined svg and I can use d2 to create diagrams with nodes/edges having the fragment class and revealjs will incrementally make them appear.

I am however missing something here. Revealjs visits fragments by walking the DOM. Hence the animation depends on the order the diagram is generated in. Moreover, it is impossible to make several nodes appear together (or you need them in the same sub-diagram). Revealjs has a nice workaround: you can add a data-fragment-index attributes in your tag. Nodes will then be visited by following the fragment-index order. Being able to just add indices to the fragment in d2 would be enough to completely animate the diagram using revealjs without any overhead, by just inlining the generated svg into the slides.

I actually do that with graphviz already, where I slightly modified the source to that I can add a data-fragment-index in nodes/edges, which allows me to animate diagrams, see [1] for example for a recent animation I did for a talk.

I guess I could do this in d2 too, but since you are expressing an interest in animation, let me tell you that just being able to specify an order in which blocks are visited and that this is performed via css modification is an extremely powerful approach (you can not only make things appear/disappear, you can also change colors for highlighting or anything you can think of that is doable in css ; changing sizes is a bit of an adventure though as you may fall into inconsistent drawing). Moreover, if you use the fragment/data-fragment-index terminology, you would directly be compatible with revealjs, which could be a really nice combination (though, there are some shortcomings with revealjs regarding fragments index which is why I now use a homecooked [2], minimal version of the same idea ; like having more than one index for a block, or being forced to use numbers for index where 1a 1b 1c would be more interesting to "insert" transition between two existing ones without having to renumber).

Let me just conclude with an example. I would love to be able to write this in d2, to make the edge "CONNECT" appears and the "info" node at the first transition:

```

a: "Alice"

b: "Bob"

a -> b: "CONNECT" {

  class: "fragment",

  fragment-index: 1

}

info: "Alice and Bob are connected" {

  class: "fragment",

  fragment-index: 1

} ```

[1] https://florent.capelli.me/talks/251016-JITA/#/103

[2] https://github.com/fcapelli/monoski/

fcholf · on Dec 1, 2022

You may use xslt for this kind of operation (https://en.wikipedia.org/wiki/XSLT) but it is often overkilled. I would love a language that allows to easily write transductions of HTML documents but I think that combining simplicity and expressiveness for this kind of tasks is a bit painful. I usually end up using pandoc filters or some dedicated javascript script that transforms the DOM and output a new HTML file.

fcholf · on Oct 12, 2022

Actually, this has been done: https://arxiv.org/abs/1903.11391

fcholf · on Sept 27, 2022

> #P complete is at least as difficult as NP complete.

This is an euphemism :)! It is quite likely that #P is way harder than NP as witnessed by Toda's Theorem https://en.wikipedia.org/wiki/Toda%27s_theorem

fcholf · on Sept 27, 2022

Well this is true but CDCL SAT solvers do not materialize this BDD and they stop as soon as they find a satisfying assignment. If they do not find any satisfying assignment, they do not return the BDD as unsat proof but roughly the list of learnt clause. If these clauses have been learnt using vanilla CDCL solver technique, one can check from this that the formula is indeed unsat. See the (D)RAT proof format (check e.g., references listed for this tool https://www.cs.utexas.edu/~marijn/drat-trim/).

fcholf · on Sept 27, 2022

This is not the same goal as SAT. SAT looks for one satisfying assignment while OBDD tries to represent the full set of satisfying assignment in a factorized way to be analyzed later. For example, trying to count the number of satisfying assignment using a vanilla SAT-solver would be quite bad as you would end up generating every assignment while OBDD can sometimes take advantage of factorizing some part of the input.

But even if you are only interested in satisfiability, it sometimes happened that OBDD-based solvers are more efficient than CDCL SAT-solver. Indeed, for some application, you need to have richer constraints than the clauses used in CNF formulas. For example, for circuit synthesis, you often need to represent parity constraints (parity(x1...xn) is true iff there are an even number of values set to 1). CNF encoding of such constraints are expensive and kill most of the clever stuff that CDCL solvers do, while representing a parity constraint is actually quite easy with an OBDD of size 2n.

freemint · on Sept 27, 2022

There are also sat solvers which do (approximate) solution counting. It is just that all these pointers seems so cache unfriendly ... But thank you for your insight

dragontamer · on Sept 27, 2022

See the paper "The Number of Knight's Tours Equals 33,439,123,484,294 | Counting with Binary Decision Diagrams" by Martin Lobbing and Ingo Wegener. Bonus points, this occurred in the 80s, when computers only had kilobytes of memory (not even MBs). So... yeah, BDDs are an incredibly powerful technique.

BDDs obtain an exact count, and are among the most efficient algorithms for this problem.

"Counting" the solutions is closely related to "finding at least one solution". But they are fundamentally different. BDDs will be "less efficient at finding just one solution" compared to traditional SAT solvers. But SAT solvers are much less efficient at enumerating the entire solution space and/or finding exact counts.

fcholf · on Sept 27, 2022

I would be curious to know examples of SAT solvers you have in mind for approximate counting. The only tools I am aware of for approximate counting are dedicated to this task (and usually use SAT solvers as oracles under the hood).

fcholf · on Sept 27, 2022

Interesting to see it posted on HN. My favourite book on the subject is Branching Programs and Binary Decision Diagrams by Ingo Wegener: https://doi.org/10.1137/1.9780898719789.

There have been recent exciting developments of the same underlying idea for more complicated data structures (consisting in syntactic restrictions of Boolean circuits) with applications to other domains in computer science. This is known as "knowledge compilation" where the original goal is to transform a knowledge base offline to represent it to a more tractable data structure efficiently supporting "online" queries such as model counting and conditioning.

See for example some of the knowledge compilers listed here http://beyondnp.org/pages/solvers/knowledge-compilers/ for Boolean functions, applications to databases with so called factorized databases https://fdbresearch.github.io/index.html.

dragontamer · on Sept 27, 2022

The main issue with that book is that it talks about BDDs in general, as opposed to the most common ROBDD that people are probably interested in.

Its a lot of 'spinning up' to start from branching programs and working your way to ROBDDs. In many ways, ROBDDs are easier than a lot of the stuff discussed earlier in the book.

So the layout probably should be reworked. But otherwise, the info is all there, and its good that the book hits "rock bottom" so to speak, with regards to theory.

sitkack · on Sept 29, 2022

What do you both think of "The Calculus of Computation" by Bradley and Manna?

https://link.springer.com/book/10.1007/978-3-540-74113-8

Slides for the book are here, https://web.stanford.edu/class/cs156/slides/Technion/