C++ exceptions under the hood II: a tiny ABI

Post by Nico Brailovsky @ 2013-02-12 | Permalink | Leave a comment

If we are going to try and understand why exceptions are complex and how do they work, we can either read a lot of manuals or we can try to write something to handle the exceptions ourselves. Actually, I was surprised by the lack of good information on this topic: pretty much everything I found is either incredibly detailed or very basic, with one exception or two. Of course there are some specifications to implement (most notably the ABI for c++ but we also have CFI, DWARF and libstdc) but reading the specification alone is not enough to really learn what's going on under the hood.

Let's start with the obvious then: wheel reinvention! We know for a fact that plain C doesn't handle exceptions, so let's try to link a throwing C++ program with a plain C linker and see what happens. I came up with something simple like this:

#include "throw.h"
extern "C" {
    void seppuku() {
        throw Exception();
    }
}

Don't forget the extern stuff, otherwise g++ will helpfully mangle our little function's name and we won't be able to link it with our plain C program. Of course, we need a header file to "link" (no pun intended) the C++ world with the C world:

struct Exception {};
#ifdef __cplusplus
extern "C" {
#endif
    void seppuku();
#ifdef __cplusplus
}
#endif

And a very simple main:

#include "throw.h"
int main()
{
    seppuku();
    return 0;
}

What happens now if we try to compile and link together this frankencode?

> g++ -c -o throw.o -O0 -ggdb throw.cpp
> gcc -c -o main.o -O0 -ggdb main.c

Note: You can download the full sourcecode for this project in my github repo.

So far so good. Both g++ and gcc are happy in their little world. Chaos will ensue once we try to link them, though:

> gcc main.o throw.o -o app
throw.o: In function foo()':
throw.cpp:4: undefined reference to__cxa_allocate_exception'
throw.cpp:4: undefined reference to __cxa_throw'
throw.o:(.rodata._ZTI9Exception[typeinfo for Exception]+0x0): undefined reference tovtable for __cxxabiv1::__class_type_info'
collect2: ld returned 1 exit status

And sure enough, gcc complains about missing C++ symbols. Those are very special C++ symbols, though. Check the last error line: a vtable for cxxabiv1 is missing. cxxabi, defined in libstdc++, refers to the application binary interface for C++. So now we have learned that the exception handling is done with some help of the standard C++ library with an interface defined by C++'s ABI.

The C++ ABI defines a standard binary format so we can link objects together in a single program; if we compile a .o file with two different compilers, and those compilers use a different ABI, we won't be able to link the .o objects into an application. The ABI will also define some other formats, like for example the interface to perform stack unwinding or the throwing of an exception. In this case, the ABI defines an interface (not necessarily a binary format, just an interface) between C++ and some other library in our program which will handle the stack unwinding, ie the ABI defines C++ specific stuff so it can talk to non-C++ libraries: this is what would enable exceptions thrown from other languages to be caught in C++, amongst other things.

In any case, the linker errors are pointing us to the first layer into exception handling under the hood: an interface we'll have to implement ourselves, the cxxabi. For the next article we'll be starting our own mini ABI, as defined in the C++ ABI.