From Russ Cox:

Lumping both non-portable and buggy code into the same category was a mistake. As time has gone on, the way compilers treat undefined behavior has led to more and more unexpectedly broken programs, to the point where it is becoming difficult to tell whether any program will compile to the meaning in the original source. This post looks at a few examples and then tries to make some general observations. In particular, today’s C and C++ prioritize performance to the clear detriment of correctness.

I am not claiming that anything should change about C and C++. I just want people to recognize that the current versions of these sacrifice correctness for performance. To some extent, all languages do this: there is almost always a tradeoff between performance and slower, safer implementations. Go has data races in part for performance reasons: we could have done everything by message copying or with a single global lock instead, but the performance wins of shared memory were too large to pass up. For C and C++, though, it seems no performance win is too small to trade against correctness.

  • mo_ztt ✅@lemmy.world · 1 year ago

    I’m definitely open to the idea that C and C++ have problems, but the things listed in this article aren’t them. He lists some very weird behavior by the clang compiler and then blames it on C, even though to my mind these are clearly misfeatures of clang. He talks about uncertainty around arithmetic overflow… unless I’ve missed something, every chip architecture that 99% of programmers will ever encounter uses two’s complement, so the undefined behavior he talks about is in practice defined.
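
    To make the overflow disagreement concrete, here’s a minimal sketch (hypothetical code; the exact result depends on compiler and optimization level). The hardware would wrap, but the optimizer is allowed to assume signed overflow never happens:

    ```cpp
    #include <climits>
    #include <cstdio>

    // Hypothetical overflow guard: on two's-complement hardware the programmer
    // expects INT_MAX + 1 to wrap to INT_MIN, making this test true at the limit.
    // But signed overflow is undefined behavior, so the optimizer may treat
    // `x + 1 < x` as always false and compile the function to `return false`.
    bool add_would_wrap(int x) {
        return x + 1 < x;
    }

    int main() {
        std::printf("%d\n", add_would_wrap(INT_MAX));  // often prints 0 under -O2
    }
    ```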

    He says:

    But not all systems where C and C++ run have hardware memory protection. For example, I wrote my first C and C++ programs using Turbo C on an MS-DOS system. Reading or writing a null pointer did not cause any kind of fault: the program just touched the memory at location zero and kept running. The correctness of my code improved dramatically when I moved to a Unix system that made those programs crash at the moment of the mistake. Because the behavior is non-portable, though, dereferencing a null pointer is undefined behavior.

    This is the same thing. He’s taking something that’s been a non-issue in practice for decades and deciding it’s an issue again. Yes, programming in C has some huge and unnecessary difficulties on non-memory-protected systems. The next time I’m working on that MS-DOS project, I’ll be sure to do it in Python to avoid those difficulties. OH WAIT

    Etc etc. C++ actually has enough big flaws to fill an essay ten times this long about things that cause active pain to working programmers every day… but no, we’re unhappy that arithmetic overflow depends on the machine’s reliably-predictable behavior instead of being written into the C standard regardless of the machine architecture. It just seems like a very weird and esoteric list of things to complain about.

    Edit: Actually, I thought about it, and I don’t think clang’s behavior is wrong in the examples he cites. Basically, you’re using an uninitialized variable, and choosing to use compiler settings which make that legal, and the compiler is saying “Okay, you didn’t give me a value for this variable, so I’m just going to pick one that’s convenient for me and do my optimizations according to the value I picked.” Is that the best thing for it to do? Maybe not; it certainly violates the principle of least surprise. But, it’s hard for me to say it’s the compiler’s fault that you constructed a program that does something surprising when uninitialized variables you’re using happen to have certain values.
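
    As a minimal sketch of the kind of surprise being discussed (hypothetical; actual behavior varies by compiler and optimization level):

    ```cpp
    #include <cstdio>

    int main() {
        bool b;  // never initialized: reading it is undefined behavior
        // The optimizer may materialize each read of `b` independently, so
        // some optimizing compilers have been observed to take *both* branches.
        if (b)  std::puts("b is true");
        if (!b) std::puts("b is false");
    }
    ```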

    • Sonotsugipaa@lemmy.dbzer0.com · 1 year ago

      Of all the things the article could have used to make its point, they should have mentioned the issue of type punning through type aliasing (fancy words for “reinterpret_cast from uint32_t* to std::float32_t*”), which is something that can realistically lead to incredibly sneaky bugs with all popular compilers.
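
      For instance, a sketch of the trap and two well-defined alternatives (using plain float rather than std::float32_t, and assuming C++20 for std::bit_cast plus an IEEE-754 32-bit float):

      ```cpp
      #include <bit>      // std::bit_cast (C++20)
      #include <cstdint>
      #include <cstdio>
      #include <cstring>

      // Strict-aliasing violation: reading a uint32_t object through a float
      // lvalue. It often "works", but the optimizer may assume the two pointer
      // types never alias and reorder or cache loads across the cast.
      float pun_ub(const std::uint32_t* bits) {
          return *reinterpret_cast<const float*>(bits);
      }

      // Well-defined: copy the bytes instead of aliasing the object.
      float pun_memcpy(std::uint32_t bits) {
          float f;
          std::memcpy(&f, &bits, sizeof f);
          return f;
      }

      // Also well-defined, and usually compiles to a single move.
      float pun_bitcast(std::uint32_t bits) {
          return std::bit_cast<float>(bits);
      }

      int main() {
          std::printf("%f\n", pun_bitcast(0x3f800000u));  // 1.000000 on IEEE-754
      }
      ```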

      • mo_ztt ✅@lemmy.world · 1 year ago

        I could talk for a long time about things I don’t like about C++. This type of stuff doesn’t even scratch the surface lol.

        Years and years ago I actually wrote up a pretty in-depth blog post about this, even going so far as to show that C++ isn’t even faster than the competition once you’ve added in all the overbloated garbage it calls a standard library. I wrote a little toy implementation of some problem in C, Python, C++, and a couple of other languages, and lo and behold, the C one was faster by a mile, while the C++ one using all the easier C++ abstractions was pretty comparable with the others and actually slower than the Perl implementation.

    • metiulekm@sh.itjust.works · 1 year ago

      Edit: Actually, I thought about it, and I don’t think clang’s behavior is wrong in the examples he cites. Basically, you’re using an uninitialized variable, and choosing to use compiler settings which make that legal, and the compiler is saying “Okay, you didn’t give me a value for this variable, so I’m just going to pick one that’s convenient for me and do my optimizations according to the value I picked.” Is that the best thing for it to do? Maybe not; it certainly violates the principle of least surprise. But, it’s hard for me to say it’s the compiler’s fault that you constructed a program that does something surprising when uninitialized variables you’re using happen to have certain values.

      You got it correct in this edit. But the important part is that gcc will also do this, and both compilers are kinda expected to do so. The article cites some standard committee discussions: somebody suggested ensuring that signed integer overflow in C++20 would not be undefined behavior, and the committee decided against it. Likewise, somebody proposed around 13 years ago that compilers not be allowed to optimize out infinite loops, and the committee decided the optimization should be allowed. So these optimisations are clearly seen as features.
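
      For illustration, here is a sketch of the loop case (hypothetical; an unoptimized build just hangs, and whether an optimizing build deletes the loop depends on the compiler):

      ```cpp
      #include <cstdio>

      int main() {
          // x starts odd and stays odd (unsigned wraparound preserves parity),
          // so it can never equal 0: the loop never terminates and has no side
          // effects. C++ lets the compiler assume such a loop finishes, so it
          // may be deleted entirely, letting execution fall through.
          unsigned x = 1;
          while (x != 0) {
              x += 2;
          }
          std::puts("an odd number equalled 0?");
      }
      ```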

      And these are not theoretical issues by any means. For instance, this kernel vulnerability happened because the compiler just removed a null pointer check: https://lwn.net/Articles/342330/
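
      The pattern behind that bug, paraphrased rather than quoted from the actual kernel source, looks roughly like this:

      ```cpp
      struct tun_struct {
          void* sk;
      };

      void poll(tun_struct* tun) {
          void* sk = tun->sk;  // dereference first: if tun were null, this is UB...
          if (!tun)            // ...so the compiler may assume tun != nullptr here
              return;          //    and delete the check as dead code
          (void)sk;            // ... use sk ...
      }

      int main() {
          tun_struct t{nullptr};
          poll(&t);  // fine; the danger is the call path that passes a null pointer
      }
      ```

      Because page zero could be mapped from user space on the affected systems, the null dereference didn’t even trap, and the silently deleted check became exploitable.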

  • mrkite@programming.dev · 1 year ago

    My problem with C/C++ is that the people behind the spec have sacrificed our sanity in the name of “compiler optimization”. Signed overflow behaves the same on every CPU on the planet, so why is it undefined behaviour? Even more insane: the spec requires intN_t to be implemented via two’s complement… but signed overflow is still undefined, because compilers want to pretend they run on pixie dust instead of real hardware.