Step 0 sounds so easy. Until you realize __TIME__ exists. Then you take that away, and you find out that some compiler heuristics might not be deterministic.
Then you discover -frandom-seed - and go ballistic when you read "but it should be a different seed for each source file"
Then you figure out your linker likes emitting a timestamp in object files. Then you discover /Brepro (if you're lucky enough to use lld-link).
Then you used to discover that Win7's app compat db expected a "real" timestamp, and a hash just won't do. (Thank God, that's dead now). This is usually the part where you start questioning your life choices.
Then somebody comes to your desk and asks if you can also make partial rebuilds deterministic.
On the upside, step 1 is usually quick, there will be no tests.
Though in the last 6 years I've seen at least one case where truly deterministic builds mattered:
A performance bug only happened when a malloc() during init was not aligned to 32 bytes. glibc on x86_64 only guarantees 16 bytes, but depending on which allocs / deallocs happened before, the allocation may just happen to land on a 32-byte boundary.
The alloc / dealloc sequence before that point was pretty deterministic; however, there were a few strings containing __FILE__. And the GitLab runner checked out the code to a path containing a random number (or an index? I don't remember), without -ffile-prefix-map or the $PWD trick, so the path length varied.
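If you suspect this kind of accidental alignment dependency, a small check makes it visible. This is only a sketch: `is_aligned` and `alloc_aligned32` are names made up for illustration, and the second helper assumes a C++17 standard library that provides `std::aligned_alloc` (glibc does).

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Returns true when p sits on an `alignment`-byte boundary.
// `alignment` must be a power of two.
inline bool is_aligned(const void* p, std::size_t alignment) {
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}

// Hypothetical helper: ask for 32-byte alignment explicitly instead of
// relying on where a plain malloc() happens to land.
// std::aligned_alloc wants the size to be a multiple of the alignment,
// so round it up first. Release the result with std::free.
inline void* alloc_aligned32(std::size_t size) {
    const std::size_t rounded = (size + 31) & ~std::size_t{31};
    return std::aligned_alloc(32, rounded);
}
```

Logging `is_aligned(ptr, 32)` at the suspicious allocation is enough to tell the "fast" builds from the "slow" ones; requesting the alignment explicitly removes the dependency on the path length altogether.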
It is really nice to have deterministic builds when doing aesthetic cleanups: you can verify that the generated code does not change, or inspect the changes in the assembly and limit the scope of a change to just the affected code.
Often yes. Sometimes, no. You haven't enjoyed C++ until you get reports of the app intermittently crashing, and your build at the same version just won't.
But yes, if the goal is "slap it all in a container", that's probably good and at least somewhat reproducible. We aren't Python here! ;)
> Often yes. Sometimes, no. You haven't enjoyed C++ until you get reports of the app intermittently crashing, and your build at the same version just won't.
That's okay, it's probably just some bank in a random country that requires some software package to be installed, presumably in the interest of security, which injects a dll into every process on the machine and unsurprisingly has a bug which causes your process to crash at random in only that part of the world.
> some software package to be installed, presumably in the interest of security, which injects a dll into every process on the machine
You don't even have to get that far. Shell extensions (for file open or save dialogs) and printer drivers also introduce arbitrary DLLs into your processes. And some of them are compiled with an old version of (IIRC) Delphi or Turbo Pascal, whose DLL startup code unconditionally changes the floating point control word to something which causes unexpected behavior in some framework you're using.
(We ended up wrapping all calls to file open or save or print dialogs with code to save and restore the floating point control word, just in case they had loaded one of these annoying DLLs.)
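For anyone wondering what such a wrapper can look like: below is a rough sketch using the portable `<cfenv>` functions, not the code we actually shipped. `FpEnvGuard` and `call_with_fp_guard` are invented names; on Windows you would more likely reach for the platform-specific control word functions instead.

```cpp
#include <cfenv>

// RAII guard: captures the floating point environment (rounding mode,
// exception flags and masks) on construction and puts it back on destruction.
class FpEnvGuard {
public:
    FpEnvGuard() { std::fegetenv(&saved_); }
    ~FpEnvGuard() { std::fesetenv(&saved_); }
    FpEnvGuard(const FpEnvGuard&) = delete;
    FpEnvGuard& operator=(const FpEnvGuard&) = delete;

private:
    std::fenv_t saved_;
};

// Runs a call that may load third-party code (a file dialog, a print
// dialog, ...) and restores the floating point environment afterwards.
template <typename F>
void call_with_fp_guard(F&& untrusted_call) {
    FpEnvGuard guard;
    untrusted_call();
}
```

The guard restores the environment even if the wrapped call throws, which is the main reason to prefer it over a manual save / restore pair.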
That's probably reading uninitialized memory. You can get away with that for a VERY long time, until you can't. See the earlier valgrind recommendation.
But that sort of report isn't a deep mystery, it's just a specific class of bug. Given the description, you've got a pretty good idea of what you're looking for.
... until the cause is really and truly a non-deterministic build. Trust me, been there.
For a long-ago example: I worked on a project that had an optimizer that used time-bounded simulated annealing to optimize. No two builds ever the same. It was "great".
yeah I don't think OP is talking about byte perfect determinism, they just want CI not to explode. that's the triage goal, byte perfect determinism is not your first priority when stopping the bleeding on a legacy c++ project
> Then somebody comes to your desk and asks if you can also make partial rebuilds deterministic.
This is a good guy. Knows what they need, knows you are smart enough to potentially finally slay the dragon, will fight the bureaucracy on your behalf. Making a hard ask rarely pays off for the asker if it fails. Don't burn yourself out for it though, and don't be afraid to ask hard favors from the asker in return.
> The string can either be a number (decimal, octal or hex) or an arbitrary string (in which case it’s converted to a number by computing CRC32).
> The string should be different for every file you compile.
So basically just pass the project-relative path of the file to -frandom-seed and you'll be fine. It's a shame the guidance doesn't explain why the string should be different for every file, because that makes it feel like advice that might not be rooted in any technical reality.
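If your build driver happens to be written in C++, deriving that seed could look roughly like this. It is a sketch under assumptions: `random_seed_flag` is a made-up helper, and it assumes the source file lives under the project root.

```cpp
#include <filesystem>
#include <string>

// Builds the -frandom-seed argument from the path of the source file
// relative to the project root, so the seed differs per file but does not
// depend on where the checkout lives on disk.
inline std::string random_seed_flag(const std::filesystem::path& project_root,
                                    const std::filesystem::path& source_file) {
    // lexically_relative is a pure path computation; it never touches the disk.
    const std::filesystem::path relative =
        source_file.lexically_relative(project_root);
    // generic_string() uses '/' separators on every platform.
    return "-frandom-seed=" + relative.generic_string();
}
```

Two checkouts in different directories then hand the same seed to the compiler for the same file, which is the property you want for reproducibility.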
__TIME__ isn't actually that bad, since using it is an anti-pattern anyway; it's much better for the build system to inject the build time explicitly as an input macro (if your software needs it for UX purposes).
__FILE__ is the more annoying one, but it can be solved through -fmacro-prefix-map.
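To make both points concrete, here is a minimal sketch. `BUILD_TIMESTAMP` is a hypothetical macro name that the build system would define (for example with `-DBUILD_TIMESTAMP='"2024-01-01T00:00:00Z"'`), and `build_banner` / `file_path_length` exist only for illustration.

```cpp
#include <cstddef>
#include <string>

// The build system passes the timestamp in explicitly; a build that does
// not care about it gets a fixed placeholder and stays reproducible.
#ifndef BUILD_TIMESTAMP
#define BUILD_TIMESTAMP "unknown"
#endif

inline std::string build_banner() {
    return std::string("built: ") + BUILD_TIMESTAMP;
}

// Length of this file's path as the compiler sees it. Without a prefix map
// this changes with the checkout directory; compiling with something like
// -fmacro-prefix-map=$PWD=. keeps it stable.
inline constexpr std::size_t file_path_length = sizeof(__FILE__) - 1;
```

The timestamp becomes an ordinary build input that you can pin or drop, and any string embedding __FILE__ stops depending on where the CI runner happened to check the code out.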