From Russ Cox

Lumping both non-portable and buggy code into the same category was a mistake. As time has gone on, the way compilers treat undefined behavior has led to more and more unexpectedly broken programs, to the point where it is becoming difficult to tell whether any program will compile to the meaning in the original source. This post looks at a few examples and then tries to make some general observations. In particular, today’s C and C++ prioritize performance to the clear detriment of correctness.

I am not claiming that anything should change about C and C++. I just want people to recognize that the current versions of these sacrifice correctness for performance. To some extent, all languages do this: there is almost always a tradeoff between performance and slower, safer implementations. Go has data races in part for performance reasons: we could have done everything by message copying or with a single global lock instead, but the performance wins of shared memory were too large to pass up. For C and C++, though, it seems no performance win is too small to trade against correctness.

    1 year ago

    Edit: Actually, I thought about it, and I don’t think clang’s behavior is wrong in the examples he cites. Basically, you’re using an uninitialized variable, and choosing to use compiler settings which make that legal, and the compiler is saying “Okay, you didn’t give me a value for this variable, so I’m just going to pick one that’s convenient for me and do my optimizations according to the value I picked.” Is that the best thing for it to do? Maybe not; it certainly violates the principle of least surprise. But, it’s hard for me to say it’s the compiler’s fault that you constructed a program that does something surprising when uninitialized variables you’re using happen to have certain values.
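For anyone who wants to see the shape of the problem, here is a minimal sketch of the kind of loop being discussed. The function names are made up for illustration and are not taken from the article. In the first function `i` is read before anything is assigned to it, so the compiler is free to pretend it holds any value, including one that makes the loop body unreachable.

```c
#include <stdio.h>

/* Hypothetical example: `i` is never initialized, so reading it is
 * undefined behavior. An optimizer may assume whatever value is most
 * convenient, e.g. one that is already >= 10, and drop the loop.
 * This function is shown for contrast only and should not be called. */
int count_uninitialized(void) {
    int n = 0;
    for (int i; i < 10; i++) {
        n++;
    }
    return n;
}

/* Same loop with the counter initialized: the behavior is fully defined
 * and the function always returns 10. */
int count_initialized(void) {
    int n = 0;
    for (int i = 0; i < 10; i++) {
        n++;
    }
    return n;
}
```

Only `count_initialized()` has a result you can rely on. What the other function returns depends on the compiler and optimization level.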

    You got it right in this edit. But the important part is that gcc will also do this, and both are kinda expected to. The article cites some standards committee discussions: somebody proposed that signed integer overflow stop being UB in C++20, and the committee decided against it. Also, about 13 years ago somebody proposed not allowing compilers to optimize out infinite loops, and the committee decided it should be allowed. So these optimisations are clearly seen as features.
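To make the signed overflow point concrete, here is a rough sketch with names I made up for illustration. The first function tests for overflow after the fact, which depends on the very wraparound the standard leaves undefined, so an optimizer may assume `a + 1 < a` is never true. The second checks the operands before adding, so no overflow ever happens.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: detects overflow by relying on wraparound.
 * Signed overflow is undefined, so a compiler may fold this to
 * "return false". Shown for contrast only; not called below. */
bool increment_overflows_broken(int a) {
    return a + 1 < a;
}

/* Hypothetical helper: validates the operands first, so the addition
 * itself can never overflow. Returns false if a + b is not representable
 * in an int; otherwise stores the sum in *out and returns true. */
bool checked_add(int a, int b, int *out) {
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return false;
    }
    *out = a + b;
    return true;
}
```

With this, `checked_add(INT_MAX, 1, &r)` reports failure instead of overflowing. GCC and Clang also accept `-fwrapv`, which defines signed overflow as wrapping, but then the code depends on a compiler flag rather than on the language.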

    And these are not theoretical issues by any means. There has been this vulnerability in the kernel, for instance, which happened because the compiler just removed a null pointer check.
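The general pattern behind a removed null check looks roughly like this. It is a sketch of the shape of such a bug, not the actual kernel code, and `struct device` and both function names are invented for illustration. Once the pointer has been dereferenced, the compiler may conclude it cannot be null and treat the later check as dead code.

```c
#include <stddef.h>

/* Hypothetical type, for illustration only. */
struct device {
    int flags;
};

/* The dereference comes before the check. Dereferencing a null pointer is
 * undefined, so the compiler may infer dev != NULL from the first line
 * and delete the check that follows. Never call this with NULL. */
int read_flags_check_too_late(struct device *dev) {
    int flags = dev->flags;
    if (dev == NULL) {
        return -1;
    }
    return flags;
}

/* The check comes first, so it cannot be optimized away on these grounds. */
int read_flags(struct device *dev) {
    if (dev == NULL) {
        return -1;
    }
    return dev->flags;
}
```

Both functions return the same thing for a valid pointer. Only `read_flags` is guaranteed to return -1 for NULL.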