Undefined behavior in the wild

So I was hacking the code of an open source C++ project. The project built with gcc v5.4 and ran without problems. Then, I grabbed gcc v6.2 to see how much performance improvement it could bring, especially with its more up-to-date LTO. As expected, the project built without problems. However, it failed at run-time with a segfault.

Such an error is usually a sign of Undefined Behavior (UB) constructs in the program. A newer compiler version may well be stricter and reject programs that an older version accepted. However, such rejection should happen at compile time and not show up as unexpected errors at run-time. A program that compiles cleanly with both versions but only misbehaves with the newer one is a typical symptom of UB.

UB is a well-known concept in C/C++ which refers to operations that have no defined semantics in the language specification. Compilers are therefore allowed to assume that such operations never happen. Basically, the UB rules constitute a "contract" that the developer needs to uphold. If this contract is violated, the compiler is free to do whatever it sees fit. This includes summoning nasal demons.

In the following, I shall elaborate on this UB instance since it involves several interesting C++ concepts. My discussion, while focusing on a particular UB instance, can provide insights on the problems caused by UB in your code and how to avoid them. For more details on UB, I recommend John Regehr's blog series.

Identifying the problem

Consider the following code snippet, which is a reduced version of the original UB instance. The developer created the templated method as() (line 7) as a concise way to do dynamic casting across the class hierarchy. The code "works" after compiling it with gcc v5.4 or clang v4.8. Note that a nullptr check is in place (line 24) since snd_ptr is expected to be nullptr whenever fst_ptr is nullptr. Interestingly, this snippet works even when compiled with gcc v6.2 with no optimizations (-O0). However, it segfaults at line 22 when compiled with gcc v6.2 at optimization level -O1 or higher.

#include <iostream>
class Base {
public:
  int foo;

  template<typename T>
  T* as()
  {
      return dynamic_cast<T*>(this);
  }
  virtual ~Base() = default;
};

class Derived: public Base {
public:
  int bar;
};

int main(void) {
    Derived* fst_ptr = nullptr;
    std::cout << "g++ 6.2 segfaults in the next statement" << std::endl;
    Derived* snd_ptr = fst_ptr->as<Derived>();
    std::cout << "Other compilers continue ... " << std::endl;
    return (snd_ptr == nullptr)? 0: (*snd_ptr).bar;
}

Let's now analyze this case a bit further. The root cause of the problem is setting fst_ptr to nullptr. At first glance, this should not cause a problem in locating method as() since it's a non-virtual method. Remember that the compiler can emit direct calls for non-virtual methods at compile-time. For virtual methods, however, it needs to locate the vtable of the object instance at run-time, which would not be possible if the pointer to the object was nullptr.
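
As a rough mental model only (compilers are not required to work this way, and all names below are hypothetical), a non-virtual method behaves much like a free function that receives the object's address as a hidden first argument. The sketch below writes such an equivalent by hand, which illustrates why nothing from the object itself is needed to locate the function:

```cpp
// Hypothetical type for illustration; not part of the original project.
struct Point {
  int x;
  // Non-virtual: the call target is known at compile time.
  int get_x() const { return x; }
};

// Hand-written stand-in for what the member call conceptually resolves to.
// `self` plays the role of the implicit `this` pointer.
inline int point_get_x(const Point* self) { return self->x; }
```

With a valid object p, both p.get_x() and point_get_x(&p) return the same value. This model only explains the intuition; it does not make a member call through a null pointer legal.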

Having checked that method as() looks callable, we turn our attention to the dynamic cast inside it (line 9). Passing a nullptr to dynamic_cast is guaranteed to return nullptr according to the C++ standard §5.2.7/4 (thanks to this SO answer). This suggests that if this was nullptr, method as() must return nullptr as well. However, things are not that easy! Calling a non-static method through a null pointer, which is the only way for this to be nullptr, is itself UB. It happens that gcc v6.2 did act upon this and produced a segfault instead of calling dynamic_cast with a nullptr as input. Further, adding a check like "if (this == nullptr)" before the dynamic cast won't help either, since compilers may assume that this is never nullptr and are free to optimize the check away.
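
The guarantee about dynamic_cast on its own can be seen in a small sketch. The class names and the helper to_dog below are hypothetical and not taken from the project; the point is only that passing the pointer as an ordinary argument keeps every case, including nullptr, well-defined:

```cpp
// Hypothetical polymorphic hierarchy for illustration.
struct Animal { virtual ~Animal() = default; };
struct Dog : Animal {};
struct Cat : Animal {};

// Well-defined for every input: a null pointer yields a null pointer,
// a pointer to a Dog yields that Dog, anything else yields nullptr.
inline Dog* to_dog(Animal* p) { return dynamic_cast<Dog*>(p); }
```

Here to_dog(nullptr) returns nullptr, to_dog applied to the address of a Dog returns that same address, and to_dog applied to the address of a Cat returns nullptr. No member function is ever called on the pointer, so no UB is involved.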

So we tracked down this toy example of UB using manual analysis. This manual method might work with smaller codebases. However, it is unlikely to scale to larger codebases or more complex cases of UB. Fortunately, detecting UB has been largely automated in recent years thanks to the UB sanitizers integrated into gcc and clang. For example, we can compile the above snippet using the following command:

g++ -std=c++11 -Wall -Wextra -Wpedantic -fsanitize=undefined ub.cpp -o ub.out

Setting the compiler flags to the highest warning level is not critical here, although it is a recommended practice. We only need to pass the flag -fsanitize=undefined in order to activate the UB sanitizer. Running the resulting executable ub.out immediately produces an error message identifying the source of UB as the call at line 22.

Possible solutions

After detecting the UB source, the question is how to fix it. Well, as discussed previously, it's the responsibility of developers to avoid UB in their code. Ideally, this means ensuring that UB never happens for any program input. In practice, however, we want to avoid unnecessary checks, so we should at the very least ensure that UB does not happen on "expected" inputs.

Translating this to our UB case, we can either (1) insert nullptr checks before calling method as(), (2) simply replace calls to method as() with dynamic casts, or (3) convert method as() to a static method that takes the pointer as an argument. The last option requires the least amount of change to the existing code, which is a desirable property. It also has the advantage of maintaining the type safety of the original code. That is, method as() accepts only a pointer to class Base or to one of its derived classes as input. The final code after applying solution (3) will look something like the following:

#include <iostream>
class Base {
public:
  int foo;

  template<typename T>
  static T* as(Base * ptr)
  {
      return dynamic_cast<T*>(ptr);
  }
  virtual ~Base() = default;
};

class Derived: public Base {
public:
  int bar;
};

int main(void) {
    Derived* fst_ptr = nullptr;
    Derived* snd_ptr = Base::as<Derived>(fst_ptr);
    return (snd_ptr == nullptr)? 0: (*snd_ptr).bar;
}
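
For comparison, options (1) and (2) from the list above could be sketched as follows. The helper names checked_as_derived and cast_to_derived are hypothetical and only serve to wrap each approach in a function; in real code the check or the cast would likely sit directly at each call site:

```cpp
class Base {
public:
  int foo = 0;

  // Original non-static method, left unchanged.
  template<typename T>
  T* as() { return dynamic_cast<T*>(this); }

  virtual ~Base() = default;
};

class Derived : public Base {
public:
  int bar = 0;
};

// Option (1): check for nullptr before calling as(), so the method is
// only ever invoked on a valid object.
inline Derived* checked_as_derived(Base* ptr) {
  return ptr ? ptr->as<Derived>() : nullptr;
}

// Option (2): skip as() entirely and use dynamic_cast directly, which is
// well-defined for a nullptr argument.
inline Derived* cast_to_derived(Base* ptr) {
  return dynamic_cast<Derived*>(ptr);
}
```

Both helpers return nullptr for a nullptr input and for an object that is not a Derived. Option (1) has to be repeated at every call site, which is the main reason option (3) is less invasive.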

Finally, building portable and future-proof software in C/C++ requires paying attention to UB in your code. The availability of excellent tool support in the form of UB sanitizers, among others, has made catching UB significantly easier. It's recommended to regularly activate UB sanitizers on your code in order to catch potential issues early and more often.