> In C89, undefined behavior is interpreted as, “The C standard doesn’t have requirements for the behavior, so you must define what the behavior is in your implementation, and there are a few permissible options”.
That's a serious misinterpretation. C89 says that "Permissible undefined behavior" includes "ignoring the situation completely with unpredictable results". There is absolutely no requirement to document what those "unpredictable results" might be.
The standard joke is that one possible consequence of undefined behavior is making demons fly out of your nose -- not because that's actually possible, but because actually making demons fly out of your nose would not violate the standard. That's equally true in C89 and all later versions of C. The change in C99 from "Permissible" to "Possible" made no difference. The phrase "imposes no requirements" has always meant exactly that. (And if you think the change from "Permissible" to "Possible" is semantically significant, then it would have been a recognition of what was already accepted.)
I came here to say something similar. The author says:
> In C89, undefined behavior is interpreted as, “The C standard doesn’t have requirements for the behavior, so you must define what the behavior is in your implementation, and there are a few permissible options”.
This is wrong. That is the meaning of what C89 calls “implementation-defined” behavior. The implementation gets a choice; the standard may lay out what the choices are, and in any event the choice must be documented.
But with C89's “undefined” behavior, all bets are off. Numerous examples are documented of cases where real compilers were forced to behave in utterly bizarre ways because of real hardware limitations.
The C standard is quite liberal in its use of undefined behaviour.
Some cases are due to conflicting existing implementations that predated the standard. (Though implementation-defined might make more sense?) Some are for performance reasons.
And some are utterly bizarre, like some things that should be syntax errors instead being declared undefined. If memory serves right, that includes eg not closing your string literals. (Thankfully, all implementations I know give you an error message.)
I have two responses to that. The first is that I think your specific example is mistaken. The C89 standard mandates a diagnostic message “for every translation unit that contains a violation of any syntax rule or constraint” (2.1.1.3). It also specifies that a string literal is a sequence of “any member of the source character set except the double-quote ", backslash \, or new-line character” (3.1.4). So my reading is that an attempt to include a newline in a string literal is a violation of a syntax rule that requires a diagnostic, and is therefore not undefined behavior. But I am far from an expert, and I would be glad if you could correct me if I have misunderstood.
But my second point is, the people who wrote the C89 standard were really smart. They were compiler writers from many areas of research and industry and worked very hard to find a workable compromise between all of their conflicting needs. It has often happened that I've seen someone say something like “an unclosed string shouldn't be undefined behavior, it should be a syntax error”, and then someone from the committee would show up and say “we wanted to make it a syntax error, but we were supposed to codify existing practice. Widely-Used Compiler X didn't diagnose that, and in fact couldn't, because of …”. And the reason would be something I couldn't have thought of. So I have learned to give them the benefit of the doubt, because they know way more about it than I ever have or will.
Oh, I don't doubt that the standard authors were smart, and worked within the constraints that were prevalent at the time. Their justified genesis doesn't make the outcomes less bizarre.
(And especially for C++, most of the later features are essentially workarounds for bad ideas they had earlier.)
For the running example about unclosed string literals, I wonder why they didn't make it at least implementation defined behaviour instead?
The standards also seem a bit split in their purpose: on the one hand, they often try to codify existing behaviour. On the other hand, they often introduce new features that take compilers years to implement. (That probably applies more to C++ than C.)
With the benefit of hindsight, lots of problems could have been avoided if C had come with a module system a bit more sophisticated than automated copy-and-paste of include files via the preprocessor.
GCC has lots of flags to give you slightly different dialects of C. The Linux kernel for example tells GCC to never elide null pointer checks and to let signed integers overflow.
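A minimal sketch of the kind of difference those flags make (my own example; assuming gcc or clang, where the kernel's choice roughly corresponds to -fwrapv / -fno-strict-overflow):

int will_wrap(int x)
{
    /* With default options, signed overflow is undefined, so the compiler
     * may assume x + 1 never wraps, fold this test to 0, and delete any
     * branch that depends on it. Built with -fwrapv, wrapping is defined
     * and the test keeps its obvious meaning: it is true exactly when
     * x == INT_MAX. */
    return x + 1 < x;
}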
Haskell also has different dialects, but in addition to compiler flags, you can also specify which variant of the language you are using via pragmas at the top of your file. In most cases, you can mix and match modules written in different dialects; because they get translated to a more stable intermediate representation before they are combined.
C doesn't have that luxury with its include files.
For C++, there's a long running proposal to add modules to the language. But from all I've read, thanks to all the other complicated features in the language, modules are unlikely to work well for C++. (I'm mostly basing that on https://vector-of-bool.github.io/2019/01/27/modules-doa.html )
Enough ranting. Summary: I agree that the standard authors made the best effort given the situation. Doesn't make the languages more sensible, though.
Making syntax errors UB instead of specifying that the program must refuse to compile may seem unfortunate, but it does have some justification: it allows implementations to add extensions to the language and still claim compliance with the standard. If the standard mandated that any syntax not allowed by the standard must cause a compilation error, any compiler adding extensions to C syntax would be in breach of the standard.
The standard could mark that behaviour as implementation defined instead of undefined.
But it's really a non-issue: compilers like GCC don't claim to be standards compliant in all modes and with any combination of options. They are happy enough to have some combination of command line options that make them behave according to a specified standard.
No, I don't. I think that extensions are extensions which by definition are not part of the standard.
Compilers should allow extensions, but the standard does not necessarily have to. I do not say it should be done this way, but it is perfectly possible to define a strict standard and leave extensions out of it.
And that's what's happening in practice for some features.
GCC has a ton of options, and only a few of the myriad combinations give you a compiler that behaves strictly according to one of the C standards. And that's just fine.
In fact, compiler vendors experimenting, even in ways that are not allowed by the standard, is one of the main avenues to come up with new ideas for how to evolve the standard.
In fact, minor variations on semantics that are not allowed under the standard are probably more in the spirit of C than, eg, feeding the whole source file to a Python interpreter if there's any string literal anywhere in the file that's not closed on the same line.
The former violates the standard, the latter complies.
The only sense in which an unterminated string literal can be UB is like this.
It's possible that if a string literal is not terminated, the situation will cause a de facto overly long string literal token to exist. For instance:
const char *str = "short str; int x = 42; and now a really long line follows ...
if the non-termination causes a de facto large literal to exist in the program, and that large literal exceeds a minimum implementation limit, then that is UB. Probably.
Strictly speaking, since the token is not closed, it is not a string literal, and so any limitation on string literals doesn't apply to it. Unless we interpret that limit as pertaining to the implementation's tokenizer, as such, which can plausibly choke on simply the valid prefix of a string literal that is too long.
Anyway, if a newline occurs before the minimum limit on string literal length is reached, and that newline is not escaped with a backslash, then that is a straightforward syntax error.
> In C89, undefined behavior is interpreted as, “The C standard doesn’t have requirements for the behavior, so you must define what the behavior is in your implementation, and there are a few permissible options”.
Wrong. The entire point of "undefined" was to mean "the standard imposes literally no requirement."
"Implementation-defined" and "unspecified" are essentially what the author is thinking "undefined" means. To quote the standard:
> 3.4.1 implementation-defined behavior unspecified behavior where each implementation documents how the choice is made
> EXAMPLE: propagation of the high-order bit when a signed integer is shifted right.
> 3.4.3 undefined behavior behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements
> EXAMPLE: the behavior on integer overflow.
> 3.4.4 unspecified behavior use of an unspecified value, or other behavior where this International Standard provides two or more possibilities and imposes no further requirements on which is chosen in any instance
> EXAMPLE: the order in which the arguments to a function are evaluated.
The author claims that, recently, the ocean has been getting concerningly wet. GCC 1.17 dates back to 1988 and would launch nethack as an easter egg in response to something as tame as implementation-defined behavior (use of #pragma).
This is incorporated into a gamedev variation on "nasal demons" I've heard/parroted: "It can launch Nethack, it can launch nuclear missiles at a cow ranch in Alaska, or worse - it can do exactly what you expected it to do." The first two being perfectly standards-compliant behaviors when your code has an RCE vulnerability thanks to invoking undefined behavior via, say, buffer overflow. The last turning compiler upgrades into a minefield, and bug trackers into a dumping ground for once-encountered heisenbugs.
As a metapoint, the fact that there is this confusing epistemological minefield around the specified semantics is a bad thing in itself, separate from the evils of UB.
The implementation of the standard library necessarily contains code which invokes implementation-defined behavior, and the reason that the compiler doesn't treat it in the same manner is simply that it doesn't identify it as such, or is prevented from doing so. So the treatment of libraries doesn't need to be consistent with that of user programs. Practices that ensure such code is not optimized away include scattering the information needed to do so across multiple translation units, so that the compiler cannot put the pieces together.
I've become convinced that the original ANSI committee didn't intend to create the monster they did.
The reason is that Dennis Ritchie wrote a submission to the committee¹ that described the ‘noalias’ proposal as “a license for the compiler to undertake aggressive optimizations that are completely legal by the committee's rules, but make hash of apparently safe programs”, and, that “[i]t negates every brave promise X3J11 ever made about codifying existing practices, preserving the existing body of code, and keeping (dare I say it?) ‘the spirit of C.’”
Those comments describe what ‘undefined behaviour’ turned in to. The only reason dmr and others didn't make the same objections to it is that nobody realized at the time what the committee wording implied.
I don't buy these arguments. Suppose you have a little bit of UB:
char foo[10];
foo[10] = x;
That's a buffer overflow, and a clever attacker can supply input for which this really will format your hard drive. Writing a standard that allows literally anything for this particular example of UB but somehow requires sensible behavior for other types of UB is not obviously possible. Merely saying that it can't happen doesn't make it so.
I believe the committee's intent was, as they stated, to ‘codify existing practice’. That means that the compiler generates code for the statement as written (in this case, a store into location foo+10) and what happens when it runs is up to the machine and not the compiler's problem.
At the time ANSI C was under way, I was working on a commercial C compiler and following developments closely. I don't think anyone anticipated what ‘undefined behavior’ would turn into.
The problem with that approach is it inhibits a lot of perfectly-sensible optimizations. If you force the compiler to leave it up to the machine, you force it to preserve the same stack layout, the same memory write, and the same execution order relative to everything else.
You basically forbid it from working with an execution model at any level higher than "a giant array of bytes," because now you have to preserve the naive behavior of that giant array of bytes. But this is not really useful for correct programs.
The committee's intent may have (accidentally, indirectly) been to codify compilers that do very little optimization, but that doesn't mean it was a good idea.
Regardless of the cause (if it's that "one word" or not), the behaviors are crazy-making and there's a trend of more aggressively exploiting UB for optimizations.
Honest question, since I have no expertise in the field: if C89 says that "Permissible undefined behavior" includes "ignoring the situation completely with unpredictable results", can you give an example of something that would constitute non-permissible undefined behaviour according to the standard?
The author's interpretation is completely spot on. C89 does not allow the implementation to deliberately break a program that invokes undefined behavior, unless they document what they are doing. It may allow the program to break due to ignoring the situation.
> There is absolutely no requirement to document what those "unpredictable results" might be.
Only in that case: when the situation has been ignored completely.
If the situation has not been ignored completely (for instance, the compiler analyzes for the absence of that type of undefined behavior and optimizes based on that), then it is not permissible to behave with unpredictable results. The behavior now falls into a "documented manner characteristic of the implementation", or else terminating at translation or execution time with or without the issuance of a diagnostic.
One way to look at this is that they changed "permissible" to "possible" for a reason. What might that reason be? Almost certainly that the word "permissible" gives requirements.
On the other hand, this is arguably just a simple defect in the C89 wording.
In the previous sentence, it is clearly written "... for which the Standard imposes no requirements."
The issue is then that "permissible" appears to give requirements.
That's an apparent contradiction in wording, which they rectified by changing "permissible" to "possible", which is a justifiable action that doesn't change the language.
> If the situation has not been ignored completely (for instance, the compiler analyzes for the absence of that type of undefined behavior and optimized based on that)
Except optimizing based on undefined behavior is also a form of ignoring the situation. For example, say you have code that can:
1. Execute Action A if undefined behavior is present, or
2. Execute Action B if undefined behavior is not present.
A compiler could look at these two possibilities, see that one of them invokes undefined behavior, ignore that situation completely, and optimize the remaining path. And that would be perfectly in line with the wording of the standard.
The interpretation you and other commenters who dislike optimizations based on UB favor is that "ignoring the situation" should mean "treat the code as if it was conforming". Both are valid readings of the standard, and the only way one will become the dominant reading would be for the standard to use less ambiguous wording.
> Except optimizing based on undefined behavior is also a form of ignoring the situation.
That is illogical.
1. The optimizing is based on undefined behavior (your exact words). (Actually it is based on the assumption of freedom from a specific undefined behavior, but the assumption is somehow tied to that case of undefined behavior.)
2. The situation that is to be "ignored completely" is the undefined behavior.
3. So the "based on" and "ignore completely" both refer to the undefined behavior.
4. Contradiction: "based on" is quite opposite to "ignoring completely". You can only be doing one or the other.
Also, you're neglecting to provide a plausible hypothesis as to why "permissible" had to be changed to "possible", if it makes no difference in meaning.
I don't think that the original ANSI C committee would have signed off on that change. It's not an interpretation of existing committee intent, but a case of new people changing the language.
The purpose of, say, numeric overflow being undefined is that implementations didn't agree on how it was handled. Compilers simply passed that on to the machine code. That's what it means to "ignore the situation completely": just compile the code as written, and let whatever happens in the translated program happen. Programmers who understand the machine could infer the behavior anyway, for their platform.
> see that one of them invokes undefined behavior, ignore that situation completely
Removing something from consideration after evaluating it is not compatible with the wording "ignore completely". It might be compatible with "ignore"; I could swallow that much. Definitely not with the "completely" though. The C89 authors added that adverb there for a reason.
> And that would be perfectly in line with the wording of the standard.
Well, with the current wording, not the historic wording in C89.
It's illogical because of poor wording on my part that didn't really match what I wanted to say, yes. Sorry about that. As you noted, a better phrasing might be "Optimizations based on ignoring executions that result in undefined behavior are also a form of ignoring the situation". Hopefully that better captures my intent.
> Also, you're neglecting to provide a plausible hypothesis as to why "permissible" had to be changed to "possible", if it makes no difference in meaning.
I didn't intend to explain that change in wording because in my opinion it doesn't make a difference as long as the "ignoring the situation" clause remains.
> that's what it means to "ignore the situation completely": just compile the code, and let whatever happen in the translation.
That's one interpretation. The other is to ignore the code that invokes undefined behavior and optimize as if it was not there.
> Removing something from consideration after evaluating it is not compatible with the wording "ignore completely".
I'm not sure I would necessarily agree that "completely" would preclude evaluating the situation at all, but my command of English is lacking.
I think it actually wouldn't matter; given some code that may or may not exhibit UB, there are arguably two ways to end up with only one case to consider:
1. Ignore the code that exhibits UB
2. Ignore the UB and treat the code as conforming
Evaluation of the options will have to take place either way, whether it is at compiler run time, compiler design/write time, both, or something else.
I think the only fact that rescues this situation is that in the same paragraph, there is wording which assures the implementor that UB "imposes no requirements" (famous fact everyone basically knows). The later "permissible" is an unfortunate word that appears to give requirements. Without the "imposes no requirements" to contradict that word, it would be a lot more troublesome.
I wrote a detailed response to the blogger, which is "awaiting moderation". I not only point out that the C89 wording is contradictory, which justifies the replacement of the word, but that all sorts of common, useful optimizations are predicated on the absence of the program's reliance on undefined behavior.
If we cannot optimize based on the absence of undefined behavior, we cannot cache variables in registers or do CSE, or can do so only in far fewer cases.
At best we can do back-end things that just improve the compiler's output without regard for the original program (optimal allocation of real registers to pseudo-registers, peephole optimization, jump threading, ...)
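To make that concrete, here is a small sketch (my own example) of the register-caching case; the only thing that licenses it is that the cross-type store would be undefined behavior under the aliasing rules:

int x;

int f(float *p)
{
    x = 1;
    *p = 0.0f;   /* writing to an int object through a float * would be UB
                    (strict aliasing), so the compiler may assume this store
                    cannot modify x */
    return x;    /* ...and simply return the cached constant 1 instead of
                    reloading x from memory */
}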
If your comment never makes it out of moderation limbo (or maybe even if it does), would you mind posting it here?
How does not being able to assume the absence of UB prevent optimizations like those you listed? Not that I'm doubting you; this is an area I'm not very familiar with.
And "do whatever you want" isn't really what compilers are doing either. Out of all possible program executions, they are ignoring the situations where undefined behavior is invoked. That (arguably) matches what the standard says, does it not?
Take the example with writing to NULL. Ignoring the situation means writing to NULL anyway or not writing to the address, not removing the if statement. Making assumptions about the value of a pointer based on UB is not ignoring the situation, that's trying to take advantage of the situation.
C statements aren't translated one-for-one to machine code, so you can't just talk about 'removing a statement'.
In any case, lots of undefined behaviour is in the standard exactly so that the compiler can take advantage of the situation.
That's mostly so that the optimizer's job is simpler.
Eg the compiler can assume that you are already checking for null yourself, so, reasoning by contradiction, it can conclude that when an execution path looks like it might dereference a null pointer, that path is dead code. (And there's lots of dead code around.)
Similarly, the compiler is allowed to assume that loops that have no side-effects terminate. (Or rather, non-terminating loops without side-effects are undefined behaviour. So you can't rely on them to implement a busy wait.) See "C Compilers Disprove Fermat’s Last Theorem" https://blog.regehr.org/archives/140 for more about this.
The whole point of the null pointer example is that "you are checking for null yourself" doesn't work in that situation, because the compiler optimizes out the check. It becomes impossible to do the right thing.
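For readers who haven't seen it, the situation usually looks something like this hypothetical sketch (not the actual kernel code):

struct device { int flags; };          /* hypothetical type, for illustration */

int get_flags(struct device *dev)
{
    int flags = dev->flags;            /* UB if dev is NULL */
    if (dev == NULL)                   /* so the compiler may assume dev is
                                          non-null here and delete this check */
        return -1;
    return flags;
}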
> Making assumptions about the value of a pointer based on UB is not ignoring the situation
But it is? Out of all situations, you ignore those that invoke UB. It's technically correct, but can produce bad results. What you want is to ignore the undefined behavior itself, and to treat the code as conforming. That's an equally valid interpretation, and the only way to definitively get one interpretation over the other is for the standard to change that wording to say so.
> The only way to definitively get one interpretation over the other is for the standard to change that wording to say so.
No. If the interpretation of the standard results in a behavior that makes the result obviously worse, nobody should favor such an interpretation.
Even if the behavior is "undefined" as per standard, that doesn't mean that any compiler should "define" its own interpretation and implementation in a way to make every resulting program worse.
They obviously "can" and obviously are already doing this, but even that doesn't make such an approach good or desired by any user of that compiler.
That's the crux of the issue though, isn't it. I'd like to think that the compiler developers are not programming at random, and that the optimisations do improve the compilation of code that only uses defined behaviour.
If the only defined value for a condition is effectively:
if (false) { ... }
surely removing the branch is the best thing to do, regardless of what is inside the braces?
If a language makes it easy to accidentally write if (false) when the programmer was trying to test some real condition, then surely that's a problem with the language, not the compiler.
But it's not "if (false)" that was originally written. It's the problem only when the language lawyers search for the "loopholes in the standard" to derive the meaning that is turning
if (make sure something bad is not going on) {
do_something
}
into the following:
"I clever compiler author devised a way to analyze do_something and I see that you the writer of the code managed to write do_something in a way that can be considered "undefined behavior" according to this standard I'm using, therefore I am "allowed" to also remove the "if (make sure something bad is not going on)" to... nothing! See how fast the resulting code now is!"
It's searching for the loopholes to do something directly against the intention of the writer of the code that people have problems with.
"I've said this before, and I'll say it again: a standards paper is
just so much toilet paper when it conflicts with reality. It has
absolutely _zero_ relevance."
"There are competent people on standards bodies. But they aren't
_always_ competent, and the translation of intent to English isn't
always perfect either. So standards are not some kind of holy book
that has to be revered. Standards too need to be questioned."
I'm glad my compiler searches for those situations and removes the unneeded if. I write a lot of generic code - often I need the if only in some situations, and I write it trusting the compiler to figure out the situations where I don't need it and remove it. Without this I'd have to spend a lot of effort maintaining hundreds of nearly identical copies of my code, differing only in whether a particular if is needed. Not to mention all the likely bugs when someone doesn't select the right version of my code to use.
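Something like this minimal sketch (my own example, not the parent's actual code) is presumably what is meant: the guard is needed by some callers and provably dead for others once the helper is inlined.

static inline void scale(int *p, int k)
{
    if (p == NULL)      /* needed for some callers, redundant for others */
        return;
    *p *= k;
}

int scale_local(int k)
{
    int v = 10;
    scale(&v, k);       /* &v can never be NULL, so after inlining the
                           compiler can remove the check entirely */
    return v;
}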
In the given example, the if check for null was NOT unneeded at all, it was vital for proper program operation. Optimizing it away caused a bug that should not have occurred.
That happens once in a while when you try to exploit undefined behavior. I don't do that in my code. I use the undefined behavior sanitizer to prove that statement (without that tool I would make mistakes).
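For reference, a minimal sketch of what that looks like in practice (assuming gcc or clang with -fsanitize=undefined): the overflow below is reported at run time instead of silently becoming whatever the optimizer assumed.

#include <limits.h>

int main(void)
{
    int x = INT_MAX;
    return x + 1;   /* UBSan reports: signed integer overflow cannot be
                       represented in type 'int' */
}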
But it's not the programmer trying to exploit undefined behavior, it's the compiler. That's just wrong. How would you check for a null pointer in a way that the compiler can't optimize away?
I sort-of agree with you that 'exploiting' undefined behaviour is a bad thing. But that's what C (and C++) are all about: the undefined behaviour everywhere allows the compiler to make lots of optimizations without having laboriously to prove that they are safe. And those languages are all about speed at the expense of safety.
If you want (more) safety, there are languages for that.
Including C dialects you get via various compiler flags. Eg the Linux kernel uses a special gcc compiler flag to keep all null checks. For this very reason.
> If the interpretation of the standard results in a behavior that makes the result obviously worse, nobody should favor such an interpretation.
This would be true if everyone actually agreed that the more aggressive interpretation is obviously worse. Since said interpretation is claimed to result in better performance, though, it's not very clear-cut.
That's the nature of ambiguity; either you rely on convention to establish a canonical behavior, which doesn't appear to be likely to happen any time soon and isn't guaranteed to remain the same over time, or you change the wording to no longer be ambiguous.
> Even if the behavior is "undefined" as per standard, that doesn't mean that any compiler should "define" its own interpretation and implementation in a way to make every resulting program worse.
The standard offers no interpretation of any undefined behavior, that's literally what is meant by undefined. There is no execution of a program that contains UB that could be considered 'valid' by the standard. The compiler has no choice, it has to define its own interpretation.
> The standard offers no interpretation of any undefined behavior, that's literally what meant by undefined
Correct.
> There is no execution of a program that contains UB that could be considered 'valid' by the standard.
That strikes me as backward. It would be more correct to say: there is no execution of a program that contains UB that could be considered 'invalid' by the standard.
Even this isn't quite precise enough though. If there's undefined behaviour in a function which is never used, that doesn't cause the program to have undefined behaviour.
> The compiler has no choice, it has to define its own interpretation.
No, it doesn't. That's only true of implementation-defined behaviour. In the case of undefined behaviour, the program doesn't even have to behave the same way every time.
The unsigned short to signed int promotion is an example of "compiler-manufactured" UB. The compiler could have chosen a type promotion that does not cause UB.
I believe compilers are constrained by a combination of ABI and the standard here. Section 6.3.1.1 of the C11 draft states [0]:
> If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions. All other types are unchanged by the integer promotions.
So the type that the compiler promotes to is determined entirely by the widths of unsigned short and signed int, which in turn are defined by the ABI. There is no room for picking a different promotion here.
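The classic way this bites, as a sketch (assuming the common case where unsigned short is 16 bits and int is 32 bits):

#include <stdio.h>

int main(void)
{
    unsigned short a = 65535;
    unsigned int r = a * a;   /* both operands are promoted to signed int,
                                 and 65535 * 65535 overflows int: UB, even
                                 though the source types were unsigned */
    printf("%u\n", r);
    return 0;
}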
> The standard offers no interpretation of any undefined behavior
"The standard" is not compiling my program, the real compiler is doing that, and the authors of that specific implementation should do the right thing, in spite of the standard leaving something "undefined."
> The compiler has no choice, it has to define its own interpretation.
Correct. And it can define it as to not to be an adversary to the writer of the code.
> it can define it as to not to be an adversary to the writer of the code.
We see peculiar things happen when there's undefined behaviour, because of the way compilers' optimisers work. It would be possible to tame the way undefined behaviour manifests, but this would trade off against the compiler's ability to optimise. If that weren't the case, everyone would be doing it.
"When it comes to optimization, there are things that are "obvious", and
that will never generate any discussion at all, because they are faster
ways of getting the exact same result, with no question about it. Nobody
will ever fault a compiler for evaluating constant expressions at
compile-time rather than doing the math at run-time and getting the same
result. The two are indistinguishable from a result standpoint.
But optimizations that can change the value of an expression are slightly
different. They should make sense. You should be able to _explain_ them."
"If you cannot explain the results, you end up having to argue about
word-weaseling in the standard, and people _will_ disagree on what the
damn thing means. Because standards aren't mathematically exact things."
LOL... just crash the app, and then document that it's supposed to make demons fly out of your nose and that it only works with blood sacrifice hardware installed.
"Demon summoning failed, nasal hardware not found."
You are looking for a version of C that traps on undefined behavior–which is totally fine!–but do know that this would either be something that does not optimize code and/or contains runtime security checks for things that C traditionally does not have checks for.
Trapping implementations have a long and honourable history on platforms where the hardware supports it. Ironically, the whole reason most undefined behaviour is undefined is to permit trapping implementations - implementations that do something safer than just continuing with an implementation-defined value - but instead modern compilers treat it as a license to do something even more dangerous.
This particular bug actually comes from C++. In C++, threads may be assumed to always make forward progress, i.e. side-effect-free infinite loops are UB. However, this is not true in C, and it's definitely not true in Rust, yet Clang makes this assumption anyway.
It gets a lot worse; once Clang sees an infinite loop and decides it's UB, it gets to make all sorts of thoroughly invalid and silly "optimizations"; see this SO question for some examples: https://stackoverflow.com/questions/59925618/how-do-i-make-a...
> This particular bug actually comes from C++. In C++, threads may be assumed to always make forward progress, i.e. side-effect-free infinite loops are UB. However, this is not true in C
That's very much disputed. The C standard requires programs' behaviour to be as-if executed on the C abstract machine when they terminate. It's not at all clear whether the standard imposes any requirements on programs that would not terminate when executed on the C abstract machine.
C11 does have a special case rule for infinite loops. Section 6.8.5 paragraph 6:
"An iteration statement whose controlling expression is not a constant expression, that performs no input/output operations, does not access volatile objects, and performs no synchronization or atomic operations in its body, controlling expression, or (in the case of a for statement) its expression-3, may be assumed by the implementation to terminate."
(I'm personally not happy that this is stated in terms of that the implementation may assume rather than what a program may do.)
Note that this does not apply to infinite loops with constant conditions. For example, given `while (1) { /* ... */ }`, the above doesn't apply and the implementation may not assume the loop will terminate.
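In code, the distinction 6.8.5p6 draws looks roughly like this (a sketch; `done` is a hypothetical flag that some other part of the program is expected to set):

void spin_forever(void)
{
    while (1) { }        /* controlling expression is a constant expression:
                            the 6.8.5p6 license does not apply */
}

int done;                /* hypothetical flag, set elsewhere */

void wait_for_done(void)
{
    while (!done) { }    /* non-constant condition, no I/O, no volatile access,
                            no synchronization: the implementation may assume
                            this terminates, and may even remove the loop */
}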
> Note that this does not apply to infinite loops with constant conditions. For example, given `while (1) { /* ... */ }`, the above doesn't apply and the implementation may not assume the loop will terminate.
Where is that guaranteed? I can see that the above rule suggests that the implementation may not make that assumption, but it doesn't actually rule it out.
pub fn mul1(mut a: i32, b: i32) -> i32 {
    let mut out = 0;
    while a != 0 {
        out += b;
        a -= 1;
    }
    out
}

pub fn mul2(a: i32, b: i32) -> i32 {
    if a == 0 {
        0
    } else {
        b + mul2(a - 1, b)
    }
}
Both are optimized to imul. But that's not actually correct: neither of these should terminate for negative a!
Incidentally, this is actually a rust bug, caused by LLVM performing this optimization despite it being illegal in rust.
Interesting! In C/C++, optimizing to imul would be correct since if a was negative it would eventually underflow, and signed integer underflow is UB. Therefore ignoring that case is fine.
Presumably the same is true for LLVM IR, which would mean that it's Rust's responsibility to emit code that checks for that case since in Rust under/overflow is defined. Very interesting compiler bug! I noticed that they haven't fixed it for the latest version. Did you submit the bug report to Rust?
I think it depends on the flags the multiplication operation is emitted with; namely, the nsw and nuw (no signed/unsigned wrap) would denote whether the optimization LLVM does is correct. If rust emits those flags then that would be a Rust bug; if Rust does not, then it might be an LLVM bug (I'm not sure what LLVM's semantics are for regular imul without flags).
I didn't: I assumed that[1] was sufficient. But (I assume, knowing little about C++) you're right that it's actually not an infinite loop, it's signed integer underflow. Given that, I'll file a bug.
I'm not entirely sure how that is an example I asked for. In this case, it seems like the compiler in both cases was able to see that the loop always terminates, either by hitting 0 directly, or by wrapping around and then hitting 0, or by simply assuming integer underflow to be UB.
If you propagate "some condition," that could allow dead-code elimination later on, especially if an if-statement uses "some condition" as a predicate later. This seems redundant in hand-written code, but keep in mind the if-statement may actually be from an inlined function.
If "some condition" is actually testing a variable for equality, then you could do constant-folding.
- doesn't terminate, doesn't have side effects -> optimization doesn't matter
- doesn't terminate, has side effects -> optimization isn't valid
- does terminate, doesn't have side effects -> optimization would be valid, but why would such a loop exist? One example that was brought up was generic code, but I'm not entirely convinced yet.
Because of interaction with other undefined behaviour (in the body of your loop), you might think you are in a different case than what the compiler thinks.
Also the definition of 'side effect' here is a weird one: reading and writing to variables is not considered a side effect, even though a different thread could change them while your loop is running.
(And to make it closer to our situation, assume that the loop being executed was looking for something that actually has a counterexample that could be found after a long time; instead of Fermat's last theorem.)
So, generic code with a loop, which, depending on parameters, sometimes has side effects and sometimes doesn't, and in the case that it doesn't have side effects it is still not easy to recognize if the loop terminates?
I suppose that is possible, although I'm having a hard time coming up with any reasonable examples.
Even that `ud2` is only there because rustc sets the `TrapUnreachable` flag for LLVM. In older versions you would get nothing at all, so calling main would fall through to whatever was next in memory.
Correct, the bug is that LLVM (being mostly built around C++?) did not have a concept of infinite loops that could not be removed for quite a while. Here's the Rust bug about this, which also contains links to relevant discussion on the LLVM side, including some forward progress: https://github.com/rust-lang/rust/issues/28728. I believe the current status is that machinery has been added that language frontends can use to mark loops as having side effects, which will indicate to LLVM that they should not be optimized out.
But only if system call optimizations are enabled (otherwise it can't add the call to pause()), and only if you did not specifically disable the optimization for infinite loops, and only if you do not use the -Os option.
> The bug exists in LLVM regardless of whether Rust has its own fork, and you don't even need to use Rust to reproduce it.
It is not a bug in the sense of something broken, since it was intended. There is simply no way to create a formally infinite loop in LLVM IR.
Clang in C mode (and rustc) generate LLVM IR that is known to be broken for their semantics; Clang in C++ mode is fine. They could add something to the generated IR to make it formally finite in the IR instead of waiting years until LLVM IR adds something for that niche case.
> Furthermore, using your system (ie. unmodified) LLVM with Rust is supported, and still exhibits this bug.
That has nothing to do with what I said. The point is that if no frontend cares for such a feature, LLVM may not care about adding it.
If rustc was inside the LLVM project, like Clang and others are, then it would be a "bug" of LLVM and they would have an incentive for fixing it.
But given Clang does not care either in C mode, I don't have my hopes up.
This topic comes up semi-regularly here on HN but I think the underlying issue is that standards do not exist in a void, yet a certain group of people seem to think that they do --- the fact that a standard "imposes no requirements" should really be taken to mean "think about what really makes sense", not "do whatever the fuck you want". Hence the "but we still comply with the standard!" defense from compiler authors, when faced with perplexing results of UB, totally ignores the reality and practicality of what a compiler is for. Note that the standard writers have even tried to give a hint, with the "behaving during translation or program execution in a documented manner characteristic of the environment" phrasing, which is precisely what programmers are usually expecting, but it seems few actually took the hint.
Different people have different ideas on what "makes sense". The point of a standard is to not have to use common sense, which is usually not common at all. So there is no point in saying that when the standard does not say anything you should use common sense. If the standard says nothing, you should just avoid relying on that thing.
The compiler is not responsible for the quality of the source code. The programmer is. The compiler is responsible for the quality of the machine code, assuming that the source code is correct. It is good to have tools to check the quality of source code, but they are a different thing from compilers.
Personally, I am happy that the compiler optimizes the code for me when I write correct C/C++ programs. If I make a mistake and inadvertently do UB, I take the responsibility, without shifting it on the innocent compiler.
> The compiler is not responsible for the quality of the source code. The programmer is.
The last 40 years have proven how well it works in practice, especially if one plugs some kind of networking into it.
Taking responsibility should be raised to the same liability level as other engineering disciplines; then it would be interesting to see how long the myth that only bad programmers write bad C will survive.
I don't know how much this actually affects your argument, since you seem to be making a more general statement, but for the record: it's extraordinarily rare for vulnerabilities to be caused by compiler optimizations specifically. I've heard of a few instances (BIND denial of service; Linux TUN bug becoming exploitable instead of a denial of service; IIRC something with Native Client), but they're interesting precisely because they're rare. So I'd say that one shouldn't oppose compiler optimizations directly because of the risk of security vulnerabilities, although of course the prevalence of vulnerabilities can still be evidence that C programmers are prone to mistakes in general.
But the only thing illegal about it is that there's no semicolon after 'char x'.
Writing to (theoretical) padding in this way is fine. Though, you might want to check the manual to see if this could generate a trap representation on your platform. (This won't, on any platform I'm aware of.)
Also, the sizeof operator always returns 1 for char and unsigned char (C99: 6.5.3.4), so there's no reason to do sizeof(char).
I didn't understand why you would want to write that code. To me, a struct is a (very very mild) abstraction, since there can be padding added for alignment and if you don't need to know about each field's offset, then you shouldn't care.
So just write
memset(&a, 0, sizeof a);
and be done, that will zero any padding too and end up just doing the right thing. It's also very clear to the compiler what you're after, and I wouldn't be at all surprised if a compiler chose not to call memset() for this, and just does the equivalent of
a.x = 0;
a.y = 0;
or perhaps, by knowing about padding, doing a properly-sized single write to both fields at once.
Why is beside the point. I'm not making a judgment about morality or motivation. I said that one of the few concrete examples in the blog post was factually wrong. It shouldn't be used as an example for the point the author is trying to prove.
Also, for proving the point of the blog post, the example you showed instead would have been wrong in the same way as the original example from the blog post.
The point was about writing to the (theoretical) padding between fields. Your example would still have written to this padding (if it existed) in the same way. And if this padding did exist, it still wouldn't have been illegal, in either example.
> that will zero any padding too and end up just doing the right thing
No, it will not; the compiler can and occasionally will transform this into two writes that don't touch padding. It is surprisingly difficult to actually zero out padding bits. (However, your code is better than the one mentioned above, as it does correctly zero out the structure's members.)
Did you miss the part where writing to a padding byte might generate an error on some architectures? Granted I don't know of any such architectures, but C was designed to work on just about any weird thing you can dream up.
It actually would be guaranteed to 0 it if it were instead unsigned chars. For plain char, the implementation has to specify if it is signed, and if it is, what kind of padding it has. If it has none, it would also be guaranteed. Therefore, you can just check the compiler manual to see if this guarantees zeroing. (None of the mainstream compilers on mainstream platforms will have padding for signed char, either.) (C99: §6.2.5 and §6.2.6.2)
(If you know a platform where this is not true, I'd be interested in hearing about it!)
Edit: Oh, also, an implementation is allowed to have different padding in struct fields than the padding of the type itself. But it has to define this, so you would be able to look it up in the manual to see if it's different. (§6.7.2.1: "Each non-bit-field member of a structure or union object is aligned in an implementation-defined manner appropriate to its type.")
This is my point. No platform I know of puts padding between bytes, so the behavior should be clear, but since a platform CAN put padding between bytes, and writing to said padding is UB, some compiler designers think this is a license to do whatever.
It's not. Writing to the padding is always allowed, on every platform. [1]
As for whether or not every field will be zeroed in the example, and whether or not you can accidentally generate a trap representation by writing to this padding, each implementation of C99 (platform/compiler/whatever) must specify somewhere what the padding for chars in a struct is. You can look it up from the manual, or whatever documentation is provided.
If this makes you scared about your software being built on an unknown platform where you need to zero every field of the struct in the fashion shown in the example, then you can just keep a list of the implementations where you know it's safe in the readme, or have the build system stop if it's not on the approved list where you've read the manual for that implementation.
You can go ahead and fill in x86/AMD64/AArch64 for gcc/clang/msvc, because it will be fine for those. Also Power and other common stuff.
[1]: If you have an example of where it's not, I would be very interested in hearing about it!
I would actually go further and say that writing to padding is legal C, period, independent of platform. It just doesn't have to actually do anything. (In an extreme example, a call to memset can do piecemeal writes around the padding to only touch the parts of a structure that will be accessed, which can be an issue if you actually want to scrub those bytes.)
Sorry, you are right. Writing is allowed, but assuming padding is not.
Yes, it should work on all the mentioned platforms, and that's my point. It should be possible to write code that assumes there is no padding on the platform you use. Some C compilers see it differently, because they think that since padding is not defined by the C standard they can do whatever they want even on platforms where it is defined.
> Some C compilers see it differently, because they think that since padding is not defined by the C standard they can do what ever they want even on platforms where it is defined.
No, they don't. Implementation-defined behavior means that the behavior on any given implementation should be defined, and you are allowed to depend on that behavior being what it is.
You are probably thinking of something like integer overflow or unaligned pointer access, where the historical justification for it not being allowed is based on differences between architectures, and thus it could hypothetically have been made implementation-defined behavior. But the spec chose to make it undefined behavior, which is why compilers can aggressively optimize assuming it won't happen, even on architectures where the 'obvious' assembly translation has some known behavior.
However, that does not apply to padding. There is a type of potential undefined behavior involved in the example: if padding exists, then the memset does not cover y, so y is left as uninitialized, and if you then read from y (not included in the example) you would get UB. [1] But whether padding exists is implementation-defined, so if you know your implementation does not put padding there, you can safely read from y. As far as I know, all C compilers in common use respect this.
Sorry to nitpick, I hope this isn't too annoying -- do you have an example of an implementation (platform, compiler, whatever) where this statement is true?
> Some C compilers see it differently, because they think that since padding is not defined by the C standard they can do whatever they want even on platforms where it is defined.
I am genuinely interested to know of one like that.
It wouldn't be a very useful C99 compiler, because C99 says that it is OK to do this, and lots of C99 code does this. But I would like to know about the existence of an implementation like this for use as an example in the future when talking about this topic.
The example is sizeof(char)*2, not sizeof(a). If there is any padding at all the example as given will not guarantee zeroing out everything. Not sure what that has to do with unsigned vs signed.
C99 makes a specific distinction about unsigned char not having any padding in its object representation. It doesn't make this distinction for signed char or plain char (because plain char might be signed char.)
In practice, on all major implementations that I know of, both signed and unsigned char have no padding.
You might have misunderstood what I wrote. I was saying
> memset(&a, 0, sizeof(char) * 2);
can be guaranteed to be OK for reading (in addition to writing) if you check the manual of your implementation. C99 says the relevant padding/alignment rules need to be specified or documented somewhere. Of course, memset(..., sizeof a) will be fine, too, without having to check the manual to see what the padding/alignment rules are.
In the just previous post:
> It actually would be guaranteed to 0 it if it were instead unsigned chars.
and bringing up padding. But this is padding within the char, not padding within the struct. It is because of the latter that sum(sizeof(members)) is not necessarily == sizeof(struct).
The sizeof(char) and sizeof(unsigned char) are both defined as 1, and of course the bit size of an unsigned char including padding is CHAR_BIT.
And it turns out that because a char* must be able to access at least every accessible char of every other object, and must have the weakest alignment, alignof(char) pretty much has to be 1 except in a contrived example (http://port70.net/~nsz/c/c11/n1570.html#6.2.8). Further, although not required, struct member padding will be a direct result of alignof, hence you will pretty much never find a wild example where sizeof(char)*2 == 2 != sizeof(struct a)... so maybe it is fair to call it a guarantee?? But, still, I think the way you're saying it just confuses the alignment/padding issue further.
I needed to cover both types of padding to be complete. So I mentioned both types.
For structure and union fields padding/alignment, the implementation must document that padding/alignment somewhere. So you would be able to look it up from the manual ahead of time. It's not undefined behavior.
Citation needed. As far as I'm aware, memsetting structs is perfectly fine. The problem is things like memcmp, as padding bytes take unspecified values.
And I believe that's wrong. Memset copies values to bytes, regardless of whether they are padding or not. You just can't rely on padding bytes keeping that value. In particular, writing to a member may fill padding with arbitrary garbage (though the standard goes beyond that, ie you shouldn't rely on padding values ever).
Why would it be wrong? If there's 3 bytes of padding between x and y, then writing 2 bytes of 0s to &a would zero out x and the first padding byte, but would not zero y.
Yeah, my bad. I missed the point of the example (ie that the size argument might be too small). I assumed the point was memset writing to padding bytes.
It is indeed true that y might not get zeroed in principle (though I doubt that'll happen in practice for this particular example, as padding gets introduced to maintain alignment, and a type's alignment divides its size, which is 1 in the case of char).
I'm mildly surprised that wouldn't invoke undefined behavior if there was padding between the fields of the struct due to writing over only part of the second field, but then again I'm not as intimately familiar with the C standard as I should be.
Bytewise access is always possible and won't violate the strict aliasing/effective typing rules. The worst that could happen when messing with IEEE floating point values that way is creating a signalling NaN (if they are available).
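For instance (a sketch), inspecting an object's representation through unsigned char * is always permitted by those rules:

#include <stdio.h>

int main(void)
{
    double d = 1.0;
    const unsigned char *bytes = (const unsigned char *)&d;
    for (size_t i = 0; i < sizeof d; i++)
        printf("%02x ", bytes[i]);   /* bytewise read of the representation */
    putchar('\n');
    return 0;
}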
No, it is true. The point is that the code in the example is incompetently written, if you want to zero out a struct foo, you should memset sizeof(struct foo), not /* manually calculate what you think the size of foo will turn out to be */ ...
> if you want to zero out a struct foo, you should memset sizeof(struct foo), not /* manually calculate what you think the size of foo will turn out to be
Exactly. Any other approach than using sizeof(struct foo) is definitely wrong.
What the author tried to do with
sizeof(char) * 2
never worked in C except by accident, and it was so even in 1975 when no modern standard existed and "undefined" wasn't misused.
It's fine to write this if you know it's OK. There's nothing inherently wrong with it, as far as C99 is concerned. Of course, if you want to zero the entire region of the struct, sizeof(MyGuy) is better. (Though, it may write to useless padding bytes that never get read... but, it might not be useless if you're a kernel and don't want to leak information via uninitialized padding!)
I do memory twiddling a lot in C. The answer is what happens in practice depends on the machine alignment. On an 8 bit machine generally alignment is 1 byte and there is no padding. On a 32 bit machine yeah you'll have 3 bytes of padding for each element.
Me I think alignment is increasingly a bad idea. Most modern processors don't deal with memory in word sized chunks anymore. Padding just increases cache pressure.
The example above has no padding between each field, because the implementation specifies how much padding there should be. For clang, gcc, and msvc, on x86 and AMD64, there is no padding between chars in a struct.
If the struct were instead { char a; int b; } then there would be 3 bytes of padding between a and b.
The example is not trashfire and is in fact totally valid.
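To see it concretely (a sketch; the numbers in the comments are what you typically get on an x86-64 ABI, and the exact padding is implementation-defined):

#include <stdio.h>
#include <stddef.h>

struct no_pad  { char a; char b; };   /* typically sizeof == 2, no padding */
struct has_pad { char a; int  b; };   /* typically 3 bytes of padding after a,
                                         so sizeof == 8 and offsetof(b) == 4 */

int main(void)
{
    printf("no_pad:  sizeof %zu\n", sizeof(struct no_pad));
    printf("has_pad: sizeof %zu, offsetof(b) %zu\n",
           sizeof(struct has_pad), offsetof(struct has_pad, b));
    return 0;
}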
Common ARM processors will fault if you try to perform an unaligned access. You must align where required.
Edit: correction to mistake above: I originally wrote { int a; char b; } and should have said the padding came after b. I've fixed it to match the explanation text and have the padding between. I should really be writing these posts in a separate text editor.
Yes! Sorry, my mistake. int then char on x86 will put the padding at the end of the struct, not in the middle. I had originally written it that other way (char and then int), then edited it to be the other way (int and then char) but didn't update the other text. I edited the post to have this correction.
For x86 and AMD64. What about ARM, MIPS, SPARC, Alpha, whatever is in the IBM mainframes... All of the above have C compilers and each does something different. I don't know the rules for each, but I know some of them have really weird rules in specific cases.
You misunderstand. In the context of the grandparent: all the world is NOT x86/AMD64. There are many other processors with many different rules. That his code works on the above two doesn't mean it will on the others.
I took care to mention that x86 is not the only platform, and to what extent they're relevant. I also made sure to not specifically answer as if this is only for x86. I don't know why you're willingly misinterpreting what I said. It's quite frustrating, especially after I took care to not answer as if x86 were the only platform.
I asked you why you're asking about specifics here in a conversational back-and-forth, when you can just search the internet and be given the specific answers immediately.
Because you're asking me to reply again, I'll go ahead and answer your question for you: all of those platforms are the same as far as padding/alignment for char is concerned. But just to reiterate, I was careful to not require this kind of knowledge for my replies to make sense.
I don't work with them, but there are weird systems with non-8-bit bytes. I don't even know what they are, but every time this comes up in committee, those compiler writers speak up.
Fields are generally aligned so that they are at an offset which is an integer multiple of the field type's alignment. Type alignment for built-in types is generally the highest power of two which is both (less than or equal to the type's size) and (less than or equal to the maximum alignment for the target ABI). Type alignment for compound types (like struct) is generally the maximum of the constituent types' alignments.
But details may vary from implementation to implementation and platform to platform.
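A small sketch of those rules using C11's alignof; the printed values are typical for x86-64, not guaranteed by the standard:

    #include <stdio.h>
    #include <stdalign.h>

    struct mixed { char c; double d; };

    int main(void) {
        printf("%zu\n", alignof(char));          /* 1 */
        printf("%zu\n", alignof(double));        /* typically 8 */
        printf("%zu\n", alignof(struct mixed));  /* max of the members': typically 8 */
        printf("%zu\n", sizeof(struct mixed));   /* typically 16 = 1 char + 7 padding + 8 */
        return 0;
    }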
> Most modern processors don't deal with memory in word sized chunks anymore
ARM does. That is a very modern processor in very wide use.
Linux hides this from you by trapping illegal memory accesses and handling them in software. Instead of a crash, you get abysmal performance if you do not align correctly.
I'm not really convinced the author is correct in claiming that a one-word change opened the floodgates to optimizations on undefined behavior. In particular I think:
> Careful reading will reveal that the word “Permissible” has been exchanged to “Possible”. In my opinion this change has lead C to go in a very problematic direction.
is a red herring. In my opinion, the actual problematic phrase is this:
> ignoring the situation completely with unpredictable results
which didn't change between C89 and C99.
It all comes down to what "ignoring the situation" should mean. Compiler vendors appear to interpret this to mean "ignore situations that invoke undefined behavior". Programmers who dislike optimizations based on undefined behavior appear to interpret this to mean "ignore the violation that leads to undefined behavior and treat it like conforming code". Who's right? It's ambiguous.
> C compilers have taken the concept of undefined behavior even further by doing the mental acrobatics of thinking that “If undefined behavior happens, I can do want I want, So therefor I can assume that it will never happen”.
I think this description is a bit uncharitable. I think a better description might be "If undefined behavior happens, there are no constraints on program behavior. Thus, if I assume that undefined behavior never happens, optimize the program based on that assumption, and that assumption is violated, the resulting program behavior is still OK because the standard imposes no rules on what the program must do when undefined behavior is invoked."
> In C89, undefined behavior is interpreted as, “The C standard doesn’t have requirements for the behavior, so you must define what the behavior is in your implementation, and there are a few permissible options”.
is implementation-defined behavior, not undefined behavior. As the C89 standard notes, defining the implementation's behavior is only one of the things an implementation may do, not something it must do.
> is implementation-defined behavior, not undefined behavior.
They're (confusingly) misusing the word "define" there. Implementation-defined behaviour is required to be documented, but every behaviour is something that "you must define" in the sense that you have written down some (possibly implicit/emergent) definition of it as part of the compiler's source code. I.e., it's not "the standard requires you to define this", it's "you are not capable of not defining this, and the standard requires (C89) / doesn't require (C99+) you to pick from the following options".
> The point of a compiler is not to try to show off that who ever implemented it knows more loop holes in the C standard, then the user, but to help the programmer write a program that does that the programmer wants.
The author makes it sound like the people working on optimizing compilers are deliberately seeking out these weird corner cases and selecting some random surprising behavior for them out of a hat, gleefully imagining how confusing it will be for end users. That's not how it works. Optimizers can be extraordinarily complex and need to maximize this ill-defined thing called "performance" in a highly multi-dimensional solution space. They ping-pong around inside this space constrained only by the specific requirements of the standard, and it's not surprising that some of the techniques used would produce some counter-intuitive results if the programmer is breaking the rules and relying on undefined behavior. It's kind of like if you trained a neural network to classify cat and dog pictures, and then you showed it a picture of a fire truck and expected it to give you a useful result.
The idea of a new version of the C standard that defines some of the most surprising undefined behavior is an interesting one though, and I'd be interested to see how much that really impacts the ability of the optimizer to do its work.
I'd love it if the C standard just removed undefined behavior, replaced explicit instances ("the behavior is undefined" to "the behavior is implementation defined") and put in a blanket "Any behavior not specified by this standard is implementation defined". Keep the rest the same, just document the footguns. Implementation defined is exactly as powerful as undefined, it just makes the compiler writer describe what will happen.
What you want is not another language but a compiler that has a well defined behaviour and is not willing to change it between versions when UB in the C standard is triggered.
There are good reasons compilers don't warn about most undefined behavior: there is so much code in the world that might be, but really isn't, undefined behavior.
I often write code that won't divide by zero, something I can prove because I know my whole program while the compiler only knows the file/library it is working with at the moment. As such I didn't put in an if-the-divisor-is-zero check, and I don't want the compiler to waste my time with a warning that is wrong.
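A sketch of that situation (names made up): every caller in the whole program guarantees count > 0, so a warning demanding a zero check here would be noise the compiler cannot know is noise.

    double average(const double *vals, int count) {
        double sum = 0.0;
        for (int i = 0; i < count; i++)
            sum += vals[i];
        return sum / count;   /* safe by an invariant only the programmer can see */
    }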
Warning when optimization passes take advantage of UB is a reasonable desire. The compilers are getting increasingly good at this, but I believe it would be far too noisy to blanket enable all of that.
The point is to push the info to the programmer so that the code can be re-written as only defined behavior. Yes this would eliminate the 'gains' of abusing UB, but it would actually fix the errors of abusing UB.
I guess signed indvars would be a prime example. Compilers famously assume that signed induction variables don't overflow when computing loop trip counts.
Would it be better to only ever use unsigned integers for indvars and loop limits? Maybe. But that would reach deep into the types in stdlib for example.
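A hedged sketch of the indvar case being described:

    void zero_prefix(char *buf, int n) {
        /* If n == INT_MAX, i <= n can only become false after i overflows, which is
           UB for signed int. Assuming no UB, the compiler may treat the trip count
           as exactly n + 1 and vectorise on that basis; with an unsigned index it
           would also have to model wraparound back to 0. */
        for (int i = 0; i <= n; i++)
            buf[i] = 0;
    }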
The problem is the compiler cannot prove UB actually happens.
The point of these "use UB to optimize" approaches isn't to screw over simple programs with bugs. It's to better optimize larger, correct programs where higher levels of logic that the compiler is unaware of ensure UB never happens.
They do, but I'm just pointing out that trying to use constexpr as the judge of what is defined is generally not a great idea at the moment unless what you're actually trying to do is find some language-lawyer compiler bugs to file.
Most of the UB people complain about is related to aggressive, compile time optimizations. This may or may not be trivial. Often it isn't, since decades old "working" code can be broken by new compilers. You can argue that the code was always broken, of course.
Yes, but the way these optimizations work, is not that they detect UB and optimize accordingly, instead they optimize assuming no UB. There is nothing to report at compile time.
> If the compiler thinks that writing to NULL, is undefined, it can therefore assume that since you are writing to p, p can’t be NULL. If p can’t be NULL, the entire if statement can be removed and after optimization the code looks like this:
Is that correct?
I thought Linus' rant was about NULL checks that came after a dereference had already happened. In that case it at least makes sense that the compiler would assume the NULL check would be superfluous.
But how could branching to spit out an error and exit before the dereference ever get optimized out? I don't see any undefined behavior in the author's example upon which an optimization could trigger. (Or if it could trigger, it ought to trigger on array bounds checking and many other situations where removing the code would clearly cause bugs.)
You get this bug because the compiler doesn't know that the function will not return and therefore assumes that the dereference will happen even if the value is NULL. Some compilers have a (non-standard) keyword to indicate that a function will not return. Adding an "else" will "fix" the issue.
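A sketch of the shape of the example being discussed (the function name is per the article; whether a compiler may actually drop the check is exactly what the replies below debate):

    /* The definition lives in another translation unit, so the compiler here
       cannot see whether it ever returns. */
    void write_error_message_and_exit(void);

    void store(int *p) {
        if (p == NULL)
            write_error_message_and_exit();
        *p = 42;   /* if the call above is assumed to return, this is reached even for
                      p == NULL; since that is UB, the compiler may conclude p != NULL */
    }

    /* Per the comment above, either marking the function
       "_Noreturn void write_error_message_and_exit(void);" or writing an explicit
       "else { *p = 42; }" removes that line of reasoning. */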
> Some compilers have a (non-standard) keyword to indicate that a function will not return
The _Noreturn keyword was added in C11, standardising it. You can also get the now-standardised noreturn macro from <stdnoreturn.h> under C11, if you prefer that spelling.
> You get this bug because the compiler doesn't know that the function will not return and therefore assumes that the dereference will happen even if the value is NULL.
That would be even worse reasoning.
If the compiler doesn't know that the function will not return then how could it possibly implement optimizations that remove this code?
Not returning from a function is well-defined behavior in C.
Dereferencing a pointer after a NULL check condition which doesn't return is well-defined behavior.
I understand that changing an "implicit else" to an explicit else will clue modern compilers in to the fact that the code shouldn't be removed. Regardless, removing the code in the author's example is dead wrong (quite aside of what one makes of Linus' quite persuasive argument about not even touching NULL checks which follow undefined behavior).
Edit: clarification
Edit 2: Ok, I need a sanity check.
Let's forget about functions which never return. Instead, let us change "write_error_message_and_exit();" to "return 0;" which will signal an error message to the caller. Also, assume there is a "return 1;" somewhere below the author's example.
Is there any compiler at any optimization level which would optimize out my revised conditional?
If you replace write_error_message_and_exit with a return, the problem should go away. If you think that is non-obvious, I agree; that's the problem. It's very easy to do the wrong thing.
Even if you assume the function returns, I don't quite understand why the if can be optimised away. The function may still have side effects if it returns, which have to happen before whatever assigning to NULL does. Those side effects may include sending data over the network, or happen in the operating system which is protected against whatever the program does, so even UB cannot affect it.
I guess UB is not assumed to include time travel...?
UB isn't a side effect, nor does it imply a sequence point or completion of other side effects. The compiler is largely free to reorder things anyway. And indeed the standard allows the compiler to bail out and throw an error during translation. UB is beyond time. But even if it were bound to time, well, unpredictable results could include accidental time travel.
With UB a program is invalid in its whole, so anticausal effects are possible. Having said that, because of posix and the ability to catch signals, I believe that at least recent GCCs will try to preserve the ordering of side effects.
This is a corner of C I'm not familiar with, but does the standard say anything about functions that never return? If it doesn't say that implementations may assume functions return then that sounds more like a compiler bug than taking advantage of UB.
Previously, I saw discussion around a similar bug where the community accepted that it is not legal for compilers to optimize with the assumption that a function never returns [1]. Obviously, if a compiler can prove a function always returns, it is a perfectly reasonable optimization.
Unfortunately, one of the folks who filed a bug report against GCC submitted an obviously incorrect reproduction procedure [2]. The GCC folks closed the false bug report and the developers worked around the true bug. Some time later, a developer attempted to reproduce the bug with GCC 4.4 and succeeded, but could not reproduce it with versions 4.6 or 4.8 [3]. In my mind, this fact strongly supports the community's conclusion that the optimization is incorrect.
Finally, the intuition described in the stack overflow discussion is pretty sound: it must be possible to use control flow to avoid undefined behavior.
> This is a corner of C I'm not familiar with, but does the standard say anything about functions that never return?
A few things, but not a lot of detail.
> A function declared with a _Noreturn function specifier shall not return to its caller (6.7.4-8)
It is recommended, but not enforced, that the implementation warns if this might not be the case.
> The implementation should produce a diagnostic message for a function declared with a _Noreturn function specifier that appears to be capable of returning to its caller (6.7.4-9)
They do provide some example code which shows that functions marked as non-returning need to actually never return (since C implicitly returns if execution reaches the end of the function body). Not doing so is undefined behaviour.
    #include <stdlib.h>   /* for abort() */

    _Noreturn void f(void) {
        abort();   // ok
    }

    _Noreturn void g(int i) {
        // causes undefined behavior if i <= 0
        if (i > 0) abort();
    }
So a compiler should be able to assume that a _Noreturn function either loops forever or terminates. Anything else would appear to be UB.
The compiler can't assume that a function always returns.
This seems contrary to how (I believe) the compiler works with atomics. If I call some opaque function foo(), the compiler has to assume that foo() could perform sequentially consistent atomic operations, and it cannot move other reads or writes across that function call. Why isn't it also required to assume that a function could terminate the program or longjump out?
Yeah, I think you're right. I guess a more general statement would be that the example optimization may not be valid because that function may contain unknown side effects, including program termination.
In C++ the function could throw, so at least there's that.
Also I believe that, because of posix, compilers will try to preserve side effects before potential UB. At least I couldn't get GCC to optimize out the function call.
> I thought Linus' rant was about NULL checks that came after a dereference had already happened. In that case it at least makes sense that the compiler would assume the NULL check would be superfluous.
IIRC it was an early return from the function rather than an exit function, something like the pattern sketched below.
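A rough reconstruction from memory of that kernel pattern (the 2009 tun driver case is the one usually cited); the types and names below are stand-ins, not the real code:

    struct sock { int dummy; };                    /* stand-in types for the sketch */
    struct tun_struct { struct sock *sk; };
    #define POLLERR 0x0008

    unsigned int poll_like(struct tun_struct *tun) {
        struct sock *sk = tun->sk;   /* dereference happens before the check... */
        if (!tun)
            return POLLERR;          /* ...so the compiler may treat tun as non-NULL here
                                        and delete this early return */
        (void)sk;
        return 0;
    }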
The point of the article is that the author wants compilers to tell users when they do unexpected things. I don't see how your point is relevant to that.
And the problem is that users ask the compiler to optimize, and then they complain because it optimizes.
See, optimization can do a lot of unexpected things, but it is very difficult in general for the compiler to tell whether a certain application of (say) range analysis to eliminate dead code in a bunch of inlined calls is going to be unexpected. That's exactly the kind of optimization I want a compiler to do; I can write small functions that are as generic as possible yet cover all the edge cases, and then the compiler can find out which edge cases cannot apply in this particular situation.
If the compiler told the user every time it did something, you'd never finish reading the output. Might as well ask the compiler not to optimize at that point.
The compiler doesn't know it is doing unexpected things. The compiler is doing what's expected given the axioms it has to work with. It would be nice if the compiler added assertions in place to validate that the axioms hold, but that would often be very expensive. Recently I have been using the sanitizers extensively and it has been a huge improvement, but of course they are too expensive to leave on in prod. It would be nice if a cheaper, although less exhaustive, version were available.
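As a hedged aside, part of that cheaper version arguably already exists as UBSan (GCC/Clang's -fsanitize=undefined), which instruments only operations that can actually invoke UB. A minimal example of the kind of thing it reports:

    #include <limits.h>

    int main(void) {
        int x = INT_MAX;
        return x + 1;   /* with -fsanitize=undefined: "runtime error: signed integer overflow ..." */
    }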
That may be the point, but it's not the premise. Even the title is based on this flawed idea. If the author wants to make a point, they can do so without trying to drive it with incorrect ideas.
The premise is that just because the standard doesn't impose requirements, it doesn't mean that other known factors like ABI, Architecture, and other platform specifics don't matter.
Quite the converse. To me, “permissible” sounds as capricious as does “possible”. I read the article on “mass amateurization” before and am thinking about how words must sound different to me as a non-native English speaker and computer scientist. Maybe it would have been best to insert “No running with scissors!” into that spec.
The comment section also has a lot of the same justifications for the current interpretations that I see here, with fairly thorough debunkings.
Changing that one word is probably not quite enough; at the very least you also need to not make the remaining text a note, or else change the status of notes back again.
Then there is the conflict between "imposes no requirements" and "here is the range of permissible actions". These cannot both be true, and the current dogma is to resolve the conflict by pretending the second part doesn't exist (with, admittedly, some justification). But then why on earth is it in the standard at all?
It allows additional options _in between_ those listed.
I'm skeptical that such an ill-defined rationale-centric change would hold any purchase with modern compiler-writers (it is certainly arguable what points lie "between" qualitatively-described behaviors), but in general a range does not usually include literally all possibilities.
I often see people still choose C even in 2020 for some projects. The usual arguments are that C has stuck with us for several decades and that it is now rock solid. Articles like this make me think otherwise. Why do modern compilers break old code and then claim that it was broken from the start?
> Why do modern compilers break old code and then claim that it was broken from the start?
Because strictly speaking that old code is broken in that it violates the standard. Older compilers either didn't use the more aggressive interpretation of undefined behavior or didn't implement the optimizations that resulted in bad runtime behavior.
Most of the undefined behavior in C arises from traps: if your code, naively translated, would have resulted in a trap, then it is undefined behavior in C.
Preserving when--and even if--a trap occurs is generally not desired, even by most end users. The instructions most likely to cause traps (memory operations and division operations) are expensive, and hoisting them outside of loops, or dead-code eliminating them is incredibly desirable.
When people complain about "broken" undefined behavior, what they usually mean is that they expected to get a very particular kind of trap and the compiler didn't give them that trap.
Contrary to your other replies — old code, written in K&R C, was not broken from the start. ANSI C broke it with undefined behavior (by accident, I think) and ‘volatile’ (not by accident, just not fully thought through).
Other than that, @_kst is correct: it was the same in C89 and C99, and the intention is simply to let compilers make the optimizations they like, ignoring what happens in UB cases.
That particular behavior (and some other examples of pathological optimizer behavior) is probably an emergent property of how several optimizations combine, as opposed to being a specifically targeted optimization. I can imagine the optimizer doing something like:
1. Determine possible function call targets
2. Eliminate invalid targets from call set
3. If only one target remains, replace indirect call with direct call
That might be useful for something like devirtualization, but also could lead to the behavior exhibited in that blog post.
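A hedged reconstruction of the kind of example those blog posts discuss (written from memory, so details may differ from the originals):

    #include <stdlib.h>

    typedef int (*Function)(void);

    static Function Do;                    /* file-scope, so it starts out NULL */

    static int EraseAll(void) {
        return system("rm -rf /");
    }

    void NeverCalled(void) {               /* externally visible, never called in this TU */
        Do = EraseAll;
    }

    int main(void) {
        return Do();   /* calling through NULL is UB; the only non-UB value Do can hold is
                          EraseAll, so steps 1-3 above turn this into a direct EraseAll() call */
    }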
Indeed, the second blog post [0] shows that if there are two possible call targets, LLVM could generate an indirect call like what you may expect.
That is not "never used" as the programmer argues, and I find it misleading that the title of the blogpost tries to sell it as "a never used function is getting called".
The NeverCalled function has external visibility, so it could eventually be called by another CU, and it definitely references (or takes the address of) the supposedly unused "EraseAll" function. In order to be a legal program, some other CU has to call NeverCalled at some point before main(), and the compiler is just assuming that.
rms was all over this "undefined" stuff. Early gcc versions actually tried to launch the games Rogue or Nethack upon detecting #pragma directives.
Old IBM "theory of operation" books used the word "unpredictable" for this kind of thing. "unpredictable" should frighten programmers. It's worse than "random," because random results have a chance of being caught during unit or system test.
What does the compiler know about memset? Why should it? It can be a function called heyLetsFillTheMemoryOverHere(&a, wrongSize); and still it's the programmer's fault.
In fact, it's a good thing that C doesn't care about memset, and doesn't get in between me and my code. And that's what I love about C. It minds its own business and does what I tell it to do.
> If I use my compiler to compile a program on my machine, the compiler knows that I’m compiling it to the x64 instruction set
You can be cross-compiling. I do it every day. The C language doesn't need to know about every little platform out there. If a compiler wants to go the extra mile and help about it, that's a compiler-thing and not a C-language-thing.
If there is an MCU that uses 3-byte words, why should C have been aware of it 40 years in advance?
Don't blame the specifications, blame the compiler if you want.
> What does the compiler know about memset? Why should it? It can be a function called heyLetsFillTheMemoryOverHere(&a, wrongSize); and still it's the programmer's fault.
Because memset is defined by the ISO C standard to have specific semantics, which the compiler can leverage to make your code faster.
> The memset() function shall copy c (converted to an unsigned char) into each of the first n bytes of the object pointed to by s.
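A small sketch of what that leverage can look like in practice (names made up): because the semantics of memset are known, the call can be folded away entirely.

    #include <string.h>

    struct point { int x, y; };

    int origin_sum(void) {
        struct point p;
        memset(&p, 0, sizeof p);   /* known semantics: every byte of p becomes zero */
        return p.x + p.y;          /* so the whole function can be folded to "return 0;",
                                      with no call to memset emitted at all */
    }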
The standard just declares an interface, to be added to a C library (maybe?). It's not the job of the compiler to check for programmer's faults. The compiler doesn't care and it shouldn't.
> Don't blame the specifications, blame the compiler if you want.
We're not talking about bugs in compilers (in which case you can blame the compiler), we're talking about standards-compliant compilers (in which case you blame the standard).
Really bad example... Using memset to zero out members of a structure this way is actually something that should trigger a compiler error, because it makes no sense. I have limited understanding of C these days, but I consider that a good thing, because it allows me to look with fresh eyes on the crap I wrote back in the day. And this is one of them. It's nonsensical, and similar to the general Linux sentiment of naming your variables like random garbage. It's brainfuck. The only thing I can say with some distance is that exploiting undefined behavior should, in pretty much all cases, yield a compiler error.
There are many people arguing specific semantics of the authors arguments, but I believe the core problem is C and C++ both dramatically overusing “undefined” vs “unspecified”.
The difference is huge. Signed integer overflow is (per spec) undefined behaviour, so an obvious bounds check is UB and so can be removed. If it were unspecified, the compiler would be required to be at least self-consistent. E.g., it couldn't do two's complement in one place, but then treat arithmetic as not being two's complement elsewhere (the overflow checks). E.g., if the compiler emits code where INT_MAX + 1 is INT_MIN, then the compiler can't also pretend that that doesn't happen.
Undefined should be reserved solely for things that cannot have a specified behavior (UaF, OoB memory, IO weirdness, etc).
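The classic shape of the check being described, as a sketch:

    #include <limits.h>

    int increment_clamped(int a) {
        if (a + 1 < a)        /* intended overflow check; since signed overflow is UB,
                                 the compiler may assume it is always false and drop it */
            return INT_MAX;
        return a + 1;
    }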
While that looks stupid in isolation (and I agree it's annoyingly hard to check for overflow, although gcc and clang have special builtins nowadays to do it), it turns out there are important reasons for that optimisation.
In general, knowing that 'a+1' is '1 larger than a' allows for lots of optimisations: when writing to an array in order we can vectorise, do things in bigger chunks, all sorts of useful and important optimisations. If every time those were used the compiler had to check for overflow, it would seriously affect performance.
Written under the premise of "performance trumps all".
This is why getting rid of the underlying software written in C should be a concern, or at the very least, adopting hardware and development practices that tame C. After all, UNIX/POSIX clones won't get replaced overnight.
Butchers that care for their hands also make use of protective gloves when dealing with sharp knives.
In practice, a lot of software is written in C because it depends on interfaces that are defined in terms of their C APIs, without caring all that much about performance.
No C library changes its ABI every couple of seconds, and many of those tools understand C header files, so it's quite feasible to fix broken bindings every now and then.
The problem isn't changes, it's accommodating multiple versions. Even figuring out where to find headers is not necessarily easy if you're not the local C compiler, for whom the tooling must only begrudgingly exist.
Where is the constituency that would want this "sensible C"? People who care about reasonable behaviour have already moved on to better languages, a la https://www.lesswrong.com/posts/ZQG9cwKbct2LtmL3p/evaporativ... ; the people still using C are those who want "maximum performance" and don't think undefined behaviour is a problem, and so C compiler writers have (understandably) gone ever further in their exploitation of undefined behaviour to get that "maximum performance".
I don't think this can be reversed, and I don't think we should necessarily want to reverse it. We have better alternatives now. Let the C people do their thing, and get on with your life in a language that ensures that programs, even "incorrect" ones, have reasonable behaviour.
I'm pretty sure that a large part of the people working on legacy code bases would like to have a safer language. You don't move on to a better language if that means rewriting your whole project.
MTE is part of ARMv8.5, a hardware specification. There is no iOS device that ships with this, and I would expect very few (if any?) Android devices do as well. Do you have a source for it being a requirement for ARM-powered Android 11 devices? Because I would expect this to exclude almost every currently shipping device…
"Sensible C" would have to be a distinct language from C, at least as the latter is currently implemented, so it would still represent a change of languages for legacy codebases. It could probably interop with the C ABI, but we already have other languages that can do that; it's not at all clear that converting your codebase piecemeal to "sensible C" would be substantially easier than doing the same with, say, Rust.
It would also have source-level interop with C _APIs_, for which the only other language we have is C++, and which for that purpose has the same issues as C.
Cultishness is a spectrum, not a binary. C has always had a certain proportion of adherents who believed that "performance" (of code that implements microbenchmarks with unlimited hours of manual tuning) is the most important aspect of a general-purpose programming language, and that the problems of undefined behaviour are insignificant. I would say that both those viewpoints are crazy, but they are increasingly becoming the mainstream in the C community, via the mechanism described in the link: more moderate people are driven out, rendering the community more extreme, which then drives more of the moderate folk out, and so on.
I see something similar happening with functional programming. Beyond measurable improvements (ie, in error rate, etc) there is often a mindset where if a language does not implement the latest syntactic patterns, then the language cannot be considered truly functional. HN seems a breeding ground for these sentiments, and adherents can be quite self-righteous.
Out in the untamed wilds of the broader industry (and the further you move away from the Stanford sphere) I have found that fanaticism, of all sorts, tends to taper off, to be replaced mostly with the mundane tasks of churning out software for money.
Of course, no matter where you hail from, the influence of C is inescapable. Yet, provincial priests are mostly concerned with budgetary constraints and deadlines. If switching to Scala will make hiring 10% easier, then by all means.
One "fun" thing about working in Scala is that you get attacked from all sides; there are people who say that the language is too weird or too purist, and then there are people who say that the problem is that it makes it too easy to have mutable variables or imperative code, or the fact that it allows you to do traditional OO inheritance.
FWIW, my experience is that when you introduce a bit of mutable state or hidden side effects, you usually regret it later. I probably come across as fanatical at times, but that fanaticism is coming purely from painful experience; these academic, theoretical concerns become a lot less academic when you've had to deal with production bugs that they could have helped you avoid.
I find it impossible to imagine the same being true for the C devotees. I've seen fast and slow programs in all sorts of languages. I've never seen a performance problem in anything other than a scripting language that wasn't either solved with a little bit of profiling, or just fundamentally impossible in any language. I've seen a C++ program rewritten in Haskell for a 5x performance speedup. And I've seen so many segfaults.
FWIW, I've found it pretty easy to escape C. In JVM-land you really don't have to deal with it, at least 99.9% of the time; I've hit like two JVM bugs in my entire career. Scala is by no means the easiest to hire for, but it's a great language for getting on with solving business problems - whether you want to do that by churning out reams of code, or by abstracting out the mundane parts and writing only the parts that are specific to the problem. It makes the easy things easy and the hard things possible; very rarely have I felt that the language was limiting me or stopping me from doing what I wanted. There's plenty wrong with it, but it's the best choice I've found.
It is like the heresy of enabling bounds checking in C.
I have hardly seen a program where having bounds checking enabled was an issue, except naturally for stuff like real time audio or software 3D rendering back in the 16 bit days.
Even on 8 bit, plenty of successful software was written in Basic perfectly fine (naturally games were another matter).
In the very few cases where it actually mattered, it sufficed to disable them locally in the hot loop that was actually relevant.
When using C++, enabling bounds checking on STL library types or making use of at() hardly caused me not to meet customer expectations.
I don't know anybody who thinks code performance is the most important aspect of "general purpose programming languages". Everybody I know who thinks C is good for performance recognizes it's for specific applications (OSs, graphics, realtime, finance, etc.).
Except nobody I know who uses C holds that view. I'm sure they exist, but without a unifying view there is no cult.
C is a well established and widely known language with many uses. It's certainly not right for everything.
Who is the one with extreme views here?
What are these "better languages" you speak of? Because I've been looking for a good C replacement but to no avail. C++, D, Nim, Go, Rust, Zig, Odin, etc; I've looked into them but none convinced me.
What were your issues with those languages? Several of them sound like reasonable options to me.
For a mature, general-purpose language my default suggestion is OCaml. If you really absolutely can't make garbage collection work for your case (something I've never seen happen to anyone who actually tried) then your choices are more limited (and I'd probably favour Rust, despite its relative immaturity).
There are any number of good languages out there, and I could happily go into the details of which I think offers the best combination of tradeoffs (Scala). But the bigger picture is that a memory-safe language with the ML featureset (in particular first-class functions, parametric polymorphism, type inference, and sum types) should be the minimum baseline these days, and represents a substantial step up from C (at the most basic and pervasive level, being able to do error handling with a result type vastly improves your defect rate). Within that category you have plenty of reasonable choices offering their own particular selling points.
For me: libraries not available or are just wrappers on a C library anyway, inflexible memory management, bad support for OS features, etc... A lot of it is just maturity. C is ancient.
People still programming in C want C, its why they're still doing it. But you can't claim there are "no better languages" when all you're really looking for is C.
You can also move to a static language of course. With C's mostly absent type system, Rust or Haskell might take some getting used to. But Go could be an easy transition, or Java.
The sort of optimizations that compilers abuse UB for aren't that useful, actually, and they don't actually require a compiler anyway; it's more about compiler writers overfitting for benchmarks.
The idea of "nasal demons" goes back to 1992. http://catb.org/jargon/html/N/nasal-demons.html