Hacker News

I agree that this helps, although I still think that in general, the default build should never do -Werror, since people may use other toolchains and it shouldn't surprise-break downstream (I'm pretty sure this is a problem Linux distros struggle with all the time..) If it does it only in your fully reproducible CI, then it should be totally fine, of course.


The scripted, packaged Docker image with the toolchain dependencies _is_ the build. If someone decides to use a different toolchain, the problems are on them.


Yeah that works if you are not dealing with open source. If you are dealing with open source, though, it really won't save you that much trouble, if anything it will just lead to unnecessarily hostile interactions. You're not really obligated to fix any specific issues that people report, but shrugging and saying "Your problem." is just non-productive and harms valuable downstreams like Linux distributions. Especially when a lot of new failures actually do indicate bugs and portability issues.


It doesn't even work outside of open source. I am running a prerelease toolchain almost all the time on my computer. If the project at work turns on -Werror, I immediately turn it off and store away the change. Of course this means that I send in code fixes for things that don't reproduce on other people's machines yet, but I literally never receive pushback for this.


Supporting every Linux distribution and its small differences isn't free, and Linux distributions shipping things you haven't tested directly is also a way for users to get bitten by bugs or bad interactions, which they will then report to you directly anyway, so you end up responsible for it. It's complicated. It's happened plenty of times that, e.g., I've run into an obscure and bad bug caused by a packaging issue, or a downstream library that wasn't tested -- or there's a developer who has to get involved with a specific distro team to solve bugs their users are reporting directly to them but that they can't reproduce or pinpoint, because the distro is different from their own environment. Sometimes these point out serious issues, but other times it can be a huge squeeze to only get a little juice.

For some things the tradeoffs are much less clear, open-source or not, e.g. a complex multi-platform GUI application. If you're going to ship a Flatpak to Linux users, for example, then the utility of allowing any random build environment is not so clear; users will be running your produced binaries anyway. These are the minority of cases, though. (No, maybe not every user wants a Flatpak, but the developers also need to make decisions that balance many needs, and not everything will be perfect.)

Half of the problem, of course, is C and C++'s lack of consistent build environments/build systems/build tooling, but that's a conversation for another day.

That said, I generally agree with you that if you want to be a Good Citizen in the general realm of open-source C and C++ code, you should not use -Werror by default, and you should try (to whatever reasonable extent) to allow and support dependencies your users have. And try to support sanitizers, custom CFLAGS/CXXFLAGS, allow PREFIX and DESTDIR installation options, obey the FHS, etc etc. A lot of things have consolidated in the Linux world over the years, so this isn't as bad as it used to be -- and sometimes really does find legitimate issues in your code, or even issues in other projects.


Again, you don't have to fix bugs that are reported, but treating it as invalid to use any compiler versions except for the exact ones that you use is just counterproductive.

The "utility" of allowing "any random build environment" is that those random build environments are the ones that exist on your users' computers, and absent a particularly good reason why it shouldn't work (like your compiler being too old, or literally broken), for the most part it should, and usually it's not even that hard to make it work. Adopting practices like defaulting -Werror -Wall on and closing bugs as WONTFIX INVALID because the reporter isn't on one of the blessed toolchains gains you... I'm not sure what. I guess peace of mind from having fewer open issues and one less flag in your CI? But it is sure to be very annoying to users with fairly standard setups who are trying to build your software; it's pretty standard behavior to report your build failures upstream, because again, usually it does actually signal something wrong somewhere.

Developers are free to do whatever they want when releasing open source code. That doesn't mean that what they are doing is good or makes any sense. There are plenty of perfectly legal things that are utterly stupid to do, like that bizarre spat between Home Assistant and NixOS.


C++ is super annoying in this way. Many other languages (e.g. Rust) only have one compiler and good portability out of the box, which completely avoids this problem. And other ecosystems that do have multiple implementations (e.g. JavaScript) seem to have much better compatibility/interop, such that it's not typically a problem you have to spend much, if any, time on in practice.


I'm curious what sorts of CPUs and OSes those languages run on. C++ runs on all sorts of obscure real-time OSes and all the standard mainstream ones, as well as on embedded equipment and various CPUs, but a lot of that is possible because of the variety of compilers.


I've had Rust projects with strict clippy rules break when rustc is upgraded.


I would say it's still worth having -Werror for some "official" CI build even if it is disabled by default.


Open source projects that insist their docker container is the only way to go are going to be an instant reject from me. It's a total copout to just push a docker container and insist that anyone not using it is on their own.

Docker is too fraught with issues for that, and as anyone can attest, there are few things more frustrating in computing than having to chase down a chain of issues in things only superficially related to what you actually want to do.

The least a project can do is try its best not to depend on specific versions, and explicitly document, in a visible place, the minimum and maximum versions known to work, along with a last-changed date.


I would do -Wall, -Wextra, and -Werror. Again, mostly for my own sanity. But I'd wait to add -Werror until all the warnings were fixed, so regression testing would continue as they got fixed. cppcheck and clang-tidy would also eventually halt the pipeline. And *san on the tests, compiled in both debug and -O3 with a couple of compilers.


I think this depends on a bunch of stuff.

- Who are the consumers of the source code, i.e. who will ever check it out and build it? Sometimes, it's just one person. Sometimes, it's a team of engineers. In that case, -W -Werror is fine.

- How does a warning being reported make the engineers on the team feel? If the answer is, "Hold my beer for five minutes while I commit a fix", then -W -Werror might be the right call. I've been on projects like that and some of them had nontrivial source code consumers.

- How easy is it to hack the build system? Some projects have wonderfully laid out build systems. If that's the case and -W -Werror is the default, then it's not hard to go in there and change the default, if the -Werror creates problems.

- Does the project have a facility (in the build system) and policy (as a matter of process) to just simply add -Wno-blah-blah as the immediate fix for any new warning that arises? I've seen that, too.

(I'm using -Werror in some parts of a personal project. If you're a solo maintainer of a codebase that can be built that way, then it's worth it - IMO much lower cognitive load to never have non-error warnings. The choice of what to do when the compiler complains is a more straightforward choice.)


I don't really see the point of -Werror for projects where I am the only developer, because I can just fix warnings before committing, and things like erroring on unused variables are counterproductive when you are just trying something out in your local checkout. In my opinion the only place -Werror makes sense is in CI -- and there you can just as well have the CI fail on warnings, so you get all of them in the output rather than missing later ones that were never produced because the build aborted at the first -Werror failure. CI also allows more nuanced approaches, like not allowing new warnings of some types while you still deal with fixing existing ones.


Changing other dependencies can also cause the build to break. The best thing to do is to use the dependencies the project specifies.


Technically changing literally anything, including the processor microarchitecture that the developer originally tested the code on, could easily cause a real-world breakage. That doesn't mean it should, though.

Most libraries not written by Google have some kind of backwards compatibility policy. This is for good reasons. For example, if Debian updates libpng because there's a new RCE, it's ideal if they can update every package to the same new version of libpng all at once. If we go to the extreme of "exact dependencies for every package", then this would actually mean that you have to update every dependent package to a new release that has the new version of libpng, all at the same time, across all supported versions of the distribution. Not to mention, imagine the number of duplicate libraries. Many Linux distros, including Debian, have adopted a policy of only having one version of any given library across the whole repo. As far as I understand, that even includes banning statically linked copies, requiring potentially invasive patching to make sure that downstream packages use the dynamically linked system version. And trust me, if they want to do this, they *will* do this. If they can do it for Chromium, they sure as hell can do it for literally any package.

There's a balance, of course. If a distro does invasive patching and it is problematic, I think most people will be reasonable about it and accept that they need to report the issue to their distribution instead. Distros generally do accept bugs for the packages that they manage, and honestly for most packages, by the time a bug gets to you, there is a pretty reasonable chance that it's actually a valid issue, so throwing away the issue simply because it came from someone running an "unofficial" build seems really counterproductive and definitely not in the spirit of open source.

Reproducibility is good for many reasons. I do not feel it is a good excuse to just throw away potentially valid bug reports though. It's not that maintainers are under any obligation to actually act on bug reports, or for that matter, even accept them at all in the first place, but if you do accept bugs, I think that "this is broken in new version of Clang" is a very good and useful bug report that likely signals a problem.


>For example, if Debian updates libpng because there's a new RCE, it's ideal if they can update every package to the same new version of libpng all at once.

If Debian is upgrading a dependency instead of a developer, then Debian should be ready to fix any bugs they introduce.

>then this would actually mean that you have to update every dependent package to a new release that has the new version of libpng, all at the same time, across all supported versions of the distribution

This is already how it works. All vulnerable programs make an update and try to hold off on releasing it until near the embargo date. You don't have to literally update them all at the same time. It's okay if some are updated at different times than others.

>Not to mention, imagine the number of duplicate libraries.

Duplicate libraries are not an issue.

>Many Linux distros, including Debian, have adopted a policy of only having one version of any given library across the whole repo.

This is a ridiculous policy to me as you are forcing programs to use dependencies they were not designed for. This is something that should be avoided as much as possible.

>by the time a bug gets to you, there is a pretty reasonable chance that it's actually a valid issue

That doesn't mean there isn't damage done. There are many people who consider kdenlive an unstable program that constantly crashes because of distros shipping it with the incorrect dependencies. This creates reputational damage.


> It Debian is upgrading a dependency instead of a developer, then Debian should be ready to fix any bugs they introduce.

That's what the Debian Bug Tracking System is for. However, if the package is actually broken, and it's because e.g. it uses the dependency improperly and broke because the update broke a bad assumption, then it would ideally be reported upstream.

> This is already how it works. All vulnerable programs make an update and try to hold off on releasing it until near the embargo date. You don't have to literally update them all at the same time. It's okay if some are updated at different times than others.

That's not how it works in the vast majority of Linux distributions, for many reasons, such as the common rule of having only one version, or the fact that Debian probably does not want to update Blender to a new major version because libpng bumped. That would just turn all supported branches of Debian effectively into a rolling release distro.

> Duplicate libraries are not an issue.

In your opinion, anyway. I don't really think that there's one way of thinking about this, but duplicate libraries certainly are an issue, whether you choose to address them or not.

> This is a ridiculous policy to me as you are forcing programs to use dependencies they were not designed for. This is something that should be avoided as much as possible.

Honestly, this whole tangent is pointless. Distributions like Debian have been operating like this for like 20+ years. It's dramatically too late to argue about it now, but if you're going to, this is not exactly the strongest argument.

By this logic, programs are effectively designed for exactly one specific code snapshot in time of each of their dependencies.

So let's say I want to depend on two libraries, and both of them eventually depend on two different but compatible versions of a library, and only one of them can be loaded into the process space. Is this a made-up problem? No, this exact thing happens constantly, for example with libwayland.

Of course you can just pick any newer version of libwayland and it works absolutely perfectly fine, because that's why we have shared libraries and semver to begin with. We solved this problem absolutely eons ago. The solution isn't perfect, but it's not a shocking new thing, it's been the status quo for as long as I've been using Linux!

> That doesn't mean there isn't damage done. There are many people who consider kdenlive an unstable program that constantly crashes because of distros shipping it with the incorrect dependencies. This creates reputational damage.

If you want your software to work better on Linux distributions, you could always decide to take supporting them more seriously. If your program is segfaulting because of slightly different library versions, this is a serious problem. Note that Chromium is a vastly larger piece of software than Kdenlive, packaged downstream by many Linux distributions using this very same policy, and yet it is quite stable.

For particularly complex and large programs, at some point it becomes a matter of, OK, it's literally just going to crash sometimes, even if distributions don't package unintended versions of packages, how do we make it better? There are tons of avenues for this, like improving crash recovery, introducing fault isolation, and simply, being more defensive when calling into third party libraries in the first place (e.g. against unexpected output.)

Maintainers, of course, are free to complain about this situation, mark bugs as WONTFIX INVALID, whatever they want really, but it won't fix their problem. If you don't want downstreams, then fine: don't release open source code. If you don't want people to build your software outside of your exact specification because it might damage its reputation, then simply do not release code whose license is literally for the primary purpose of making what Linux distributions do possible. You of course give up access to copyleft code, and that's intended. That's the system working as intended.

I believe that ultimately releasing open source code does indeed not obligate you as a maintainer to do anything at all. You can do all manner of things, foul or otherwise, as you please. However, note that this relationship is mutual. When you release open source code, you relinquish yourself of liability and warranty, but you grant everyone else the right to modify, use and share that code under the terms of the license. Nowhere in the license does it say you can't modify it in specific ways that might damage your program's reputation, or even yours.


>That's what the Debian Bug Tracking System is for.

Software should be extensively tested and code review should be done before it gets shipped to users. Most users don't know about the Debian Bug Tracking system, but they do know about upstream.

>Honestly, this whole tangent is pointless. Distributions like Debian have been operating like this for like 20+ years. It's dramatically too late to argue about it now, but if you're going to, this is not exactly the strongest argument.

It's not too late, as evidenced by the growth of solutions like AppImage and Flatpak, which allow developers to avoid this.

>So let's say I want to depend on two libraries, and both of them eventually depend on two different but compatible versions of a library, and only one of them can be loaded into the process space. Is this a made-up problem? No, this exact thing happens constantly, for example with libwayland.

Multiple versions of a library can be loaded into the same address space. Developers can choose to have their libraries support a range of versions.

>that's why we have shared libraries and semver to begin with

Hyrum's Law. Semver doesn't prevent breakages on minor bumps.


> Software should be extensively tested and code review should be done before it gets shipped to users.

That's why distributions have multiple branches. Debian Unstable packages get promoted to Debian Testing, which get promoted to a stable Debian release. Distributions do bug tracking and testing.

> Most users don't know about the Debian Bug Tracking system, but they do know about upstream.

There are over 80,000 bugs in the Debian bug tracker. There are over 144,000 bugs in the Ubuntu bug tracker. It would suffice to say that a lot of users indeed know about upstream bug trackers.

I am not blaming anyone who did not know this. It's fully understandable. (And if you ask your users to please go report bugs to their distribution, I think most distributions will absolutely not blame you or get mad at you. I've seen it happen plenty of times.) But just FYI, this is literally one of the main reasons distributions exist in the first place. Most people do not want to be in charge of release engineering for an entire system's worth of packages. All distributions, Debian, Ubuntu, Arch, NixOS, etc. wind up needing THOUSANDS of at least temporarily downstream patches to make a system usable, because the programs and libraries in isolation are not designed for any specific distribution. Like, many of them don't have an exact build environment or runtime environment.

Flatpak solves this, right? Well yes, but actually no. When you target Flatpak, you pick a runtime. You don't get to decide the version of every library in the runtime unless you actually build your own runtime from scratch, which is actually ill-advised in most cases, since it's essentially just making a Linux distribution. And yeah. That's the thing about those Flatpak runtimes. They're effectively, Linux distributions!

So it's nice that Flatpak provides reproducibility, but it's absolutely the same concept as just testing your program on a distro's stable branch. Stable branches pretty much only apply security updates, so while it's not bit-for-bit reproducible, it's not very different in practice; Ubuntu Pro will flat out just default to automatically applying security updates for you, because the risk is essentially nil.

> It's not too late as evidence by the growth of solutions like appimage and flatpak which allows developers to avoid this.

That's not what AppImage is for; AppImage is just meant to bring portable binaries to Linux. It is about developers being able to package their application into a single file, and then users being able to use that on whatever distribution they want. Flatpak is the same.

AppImage and Flatpak don't replace Linux distribution packaging, mainly because they literally can not. For one thing, apps still have interdependencies even if you containerize them. For another, neither AppImage nor Flatpak solves the problem of providing the base operating system they run under; both are pretty squarely aimed at providing a distribution platform specifically for applications the user would install. The distribution inevitably still has to do a lot of packaging of C and C++ projects no matter what happens.

I do not find AppImage or Flatpak to be bad developments, but they are not in the business of replacing distribution packaging. What they're doing instead is introducing multiple tiers of packaging. However, for now, both distribution methods are limited and not going to be desirable in all cases. A good example is something like OBS plugins. I'm sure Flatpak either has or will provide solutions for plugins, but today, plugins are very awkward for containerized applications.

> Multiple versions of a library can be loaded into the same address space. Developers can choose to have their libraries support a range of versions.

Sorry, but this is not necessarily correct. Some libraries can be loaded into the address space multiple times; however, this is often not the case for libraries that are not reentrant. For example, if your library has internal state that maintains a global connection pool, passing handles from one instance of the library to the other will not work. I use libwayland as an example because this is exactly what you do when you want to initialize a graphics context on a Wayland surface!

With static linking, this is complicated too. Your program only has one symbol table. If you try to statically link e.g. multiple versions of SDL, you will quickly find that the two versions will in fact conflict.

Dynamic linking makes it better, right? Well, not easily. We're talking about Linux, so we're talking about ELF platforms. The funny thing about ELF platforms is that the linker keeps a global symbol lookup scope: by default, symbols are resolved globally, in library load order. This is good in some cases -- it's how libpthread replaces libc functionality with thread-safe versions, in addition to implementing the pthreads APIs -- but bad if you want multiple versions, because you will mostly get one version of a library.

In some catastrophic cases, like having both GTK+ 2 and GTK 3 in the same address space, it will just crash as you call a GTK+ 2 symbol that tries to access other symbols and winds up hitting a GTK 3 symbol instead of what it expected. You CAN resolve this, but that's the most hilarious part: the only obvious way to fix it, to my knowledge, is to compile your dependencies with different flags, namely -Bsymbolic (iirc), and they may or may not even compile with those settings; the flags are likely to be unsupported by your dependencies, ironically. (Though maybe they would accept bug reports about it.)

The only other way I'm aware of is to replace the shared library calls with dlopen with RTLD_LOCAL. Neither option is ideal, though, because they require invasive changes: the former in your dependencies, the latter in your own program. I could be missing something obvious, but this is my understanding!

> Hyrum's Law. Semver doesn't prevent breakages on minor bumps.

Hyrum's law describes buggy code that either accidentally or intentionally violates contracts to depend on implementation details. Thankfully, people will, for free, report these bugs to you. It's legitimately a service, because chances are you will have to deal with these problems eventually, and "as soon as possible" is a great time.

Just leaving your dependencies out of date and not testing against newer versions ever will lead to ossification, especially if you continue to build more code on top of other flawed code.

Hyrum's law does not state that it is good that people depend on implementation details. It just states that people will. Also, it's not really true in practice, in the sense that not all implementation details will actually wind up being depended on. It's true in spherical cow land, but taking it to its "theoretical" extreme implies infinite time and infinite users. In the real world, libraries like SDL2 make potentially breaking changes all the time that never break anything. But even when people do experience breakages as a result of a change, sometimes it's good. Sometimes these breakages reveal actual bugs that were causing silent problems before they turned into loud problems. This is especially true for memory issues, but it's even true for other issues. For example, a change to the Go programming language recently fixed a lot of accidental bugs and broke, as far as anyone can tell, absolutely nobody. But it did lead to "breakages" anyways, in the form of code that used to be accidentally not actually doing the work it was intended to do, and now it is, and it turns out that code was broken the whole time. (The specific change is the semantics change to for loop binding, and I believe the example was a test suite that accidentally wasn't running most of the tests.)

Hyrum's law also describes a phenomenon that happens to libraries being depended on. For you as a user, you should want to avoid invoking Hyrum's law because it makes your life harder. And of course, statistically, even if you treat it as fact, the odds that your software will break due to an unintended API breakage are relatively low; it's just higher that across an entire distribution's worth of software something will go wrong. But your libraries actually know that this problem exists and do their best to make it hard to rely on things outside the contract. Good C libraries use opaque pointers and carefully constrain the input domain on each of their APIs to try to expose as little unintended API surface area as humanly possible. This is a good thing, because again, Hyrum's law is an undesirable consequence!


Update for posterity: Actually, Flatpak does have a solution for plugins, and they even explicitly use OBS as an example! Unfortunately, a lot of information around the web suggests that there are only a couple of plugins available as Flatpak extensions, but actually nowadays it appears there are in fact tons[1]. Very cool! Another one off the list.

[1]: https://flathub.org/apps/com.obsproject.Studio (go to Add-ons, click "More")



