Disassembling C++ Part 2 -- Objects

Friday, June 12, 2020

What is an object?


This article is part two of my Disassembling C++ series. The first one covered overloaded functions and name mangling, and some of what we are going to discuss builds on that. Programmatically, an object is basically a collection of data and functions that serve a similar purpose. For example, you can have a bicycle object with a function you call to rotate the wheels, and you can get the wheel count (2) from a variable or an accessor function. There are entire books written on object inheritance, polymorphism, reflection, and other nuances of the object-oriented methodology, and I really don't want to rehash all of that in a five to ten minute article. I assume that if you are here reading this, you know what an object is and all of the fun things you can do with them. I’m going to make a bold statement about objects in general from the compiler's perspective, though: classes are just fancy structs. In fact, a C++ struct can have member functions, and it always could. The only difference is that a struct's members and base classes are public by default, whereas a class's are private by default. We are going to do some more C/C++ namespace hacking in this article too, so if you enjoyed that in the last one, more is coming.
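As a small illustration of the "fancy structs" claim, here is a minimal sketch that defines the same type twice, once with `struct` and once with `class`. The names `BikeStruct`, `BikeClass`, and `demoWheelCounts` are made up for this example and do not come from the series.

```cpp
#include <cassert>

// Hypothetical example types: only the default access level differs.
struct BikeStruct {
    int wheels = 2;                              // public by default in a struct
    int wheelCount() const { return wheels; }
};

class BikeClass {
    int wheels = 2;                              // private by default in a class
public:
    int wheelCount() const { return wheels; }
};

int demoWheelCounts() {
    BikeStruct s;
    BikeClass c;
    assert(s.wheels == 2);   // fine: the member is public
    // c.wheels would not compile here: the member is private
    return s.wheelCount() + c.wheelCount();
}
```

Calling `demoWheelCounts()` from a `main` of your own returns the two wheel counts added together. Swapping the keywords and adding explicit `public:` or `private:` labels would give the same two types.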

Objects only hold data


I know that most of us are used to the idea of a class (or object) as being a mysterious collection of data and functions that we can create, copy, destroy, extend, or use in a polymorphic fashion. The truth is that the functions inside a class are just regular functions whose names are mangled by the compiler, and, as far as those functions are concerned, the object is only a pointer to the data inside of it. Let’s take this simple object here:
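The original listing is not available in this copy of the article. As a stand-in, here is a minimal sketch of the kind of object the text describes: two ints and one private member function that prints them. The names `Simple`, `a`, `b`, and `show` are assumptions, not the author's.

```cpp
#include <cstdio>

// Hypothetical stand-in for the article's object (all names are assumptions).
class Simple {
    int a;
    int b;
    void show();   // private, because no access specifier precedes it
};

// An ordinary function as far as the object file is concerned; the
// members are reached through the implicit `this` pointer.
void Simple::show() {
    std::printf("a=%d b=%d\n", a, b);
}
```

With a compiler that follows the Itanium C++ ABI (GCC or Clang on Linux, for example), compiling this with `g++ -c` and listing the symbols with `nm` should show `Simple::show()` under the mangled name `_ZN6Simple4showEv`. Other ABIs, such as MSVC's, mangle differently.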


Now, if we compile this and look at the exported symbols, we see the function under its mangled name and a reference to printf, which has C linkage and is therefore not mangled.

Notice that the function, even though it is private, is still defined in the file. That is all it is, a function, and as such it can be called from regular old C. So, would this work?

Let's find out:

Kablammo! But there is a very good reason for that. Remember how I said an object is just a pointer to some data? One thing the compiler does under the hood is add a hidden first parameter to every non-static member function call: the object's address, which is what “this” refers to. Our call supplied no such address, so the function had nothing valid to read from. Now, our example object here has two ints as its data elements, so let's try an experiment.


Here, since we know what the object’s data looks like, we create a C struct with an identical layout. We then pass the address of that struct as the sole argument to a member function that was declared inside the class as taking no parameters.
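The hidden parameter can also be illustrated without leaving standard C++. The sketch below is not the two-file C experiment described above. It is a hand-written picture of the same idea, and `Counter`, `CounterData`, `Counter_total`, and `demoHiddenThis` are hypothetical names.

```cpp
#include <cassert>
#include <functional>

// Hypothetical class, used only for illustration.
class Counter {
    int first;
    int second;
public:
    Counter(int f, int s) : first(f), second(s) {}
    int total() const { return first + second; }   // `this` is implicit here
};

// A hand-written picture of what the compiler emits for Counter::total():
// an ordinary function whose first parameter is the object's address.
struct CounterData { int first; int second; };

int Counter_total(const CounterData* self) {
    return self->first + self->second;
}

int demoHiddenThis() {
    Counter c(3, 4);
    CounterData d{3, 4};

    // std::mem_fn turns the object's address into an explicit first argument.
    auto total = std::mem_fn(&Counter::total);
    assert(total(&c) == c.total());

    // The free function computes the same result from the same data.
    assert(Counter_total(&d) == c.total());
    return Counter_total(&d);
}
```

The free function and the member function do the same work. The only difference is whether the pointer to the data is spelled out in the parameter list or supplied silently by the compiler.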

This really works! Try it yourself. Also notice that I never linked against the libstdc++ library; it’s all just the standard C library using gcc. From C, we were able to create a faux object and call a real object's functions against it. Also, remember the definition of our function here? It's private. Access control is checked by the compiler and is just for us, as people, to help organize data. We could also create an inherited class from this one and it would still work; that is a compiler feature too.
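The layout claim behind that experiment can be checked in a single C++ file. This is a portable approximation, not the author's C program: instead of calling a mangled symbol, it copies the bytes of a plain struct into a real object and then calls a member function on it. `Pair`, `FauxPair`, `sumViaFauxObject`, and `demoFauxObject` are hypothetical names.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical class with private data, standing in for the "real" object.
class Pair {
    int a;
    int b;
public:
    int sum() const { return a + b; }
};

// Plain C-style struct with the same members in the same order.
struct FauxPair { int a; int b; };

static_assert(sizeof(Pair) == sizeof(FauxPair), "same size as the plain struct");
static_assert(std::is_standard_layout<Pair>::value, "private members do not change layout");
static_assert(std::is_trivially_copyable<Pair>::value, "safe to copy byte for byte");

int sumViaFauxObject(int x, int y) {
    FauxPair raw{x, y};
    Pair real;   // members start out uninitialized and are overwritten below
    // The void* cast signals that the raw byte copy is intentional.
    std::memcpy(static_cast<void*>(&real), &raw, sizeof real);
    return real.sum();
}

int demoFauxObject() {
    assert(sumViaFauxObject(3, 4) == 7);
    return sumViaFauxObject(10, 20);
}
```

The private members of `Pair` are filled in from outside the class without any accessor, which is the same point the C experiment makes: `private` restricts what the compiler will let you write, not what the bytes in memory can hold.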

Next up in the series: language features we take for granted that are actually part of the C++ library itself. The hacks in this article, which show how data is organized, work best on simple objects. When you get into more complex situations, things get bigger and more complicated. However, it all boils down to the simple things. For regular member functions, you are calling a function with a hidden first parameter that holds the object's address (its “this” pointer); for static functions, you are literally just calling a function. The rest of object-oriented methodology is simply determining what function to call and the appropriate address to pass to it. The bottom line: your class can be as complex as you want, but an object of it will still only take up as much memory as the data you have defined, plus any alignment padding and, if the class has virtual functions, a hidden pointer to its virtual table. A class with two 4-byte ints and no virtual functions will use 8 bytes of memory whether it has 2 functions or 200.
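The size claim is easy to check with `sizeof`. In this minimal sketch, `TwoInts`, `Busy`, `WithVirtual`, and `demoSizes` are hypothetical names, and the last `static_assert` reflects how mainstream compilers implement virtual functions rather than anything the language standard guarantees.

```cpp
#include <cassert>

// Baseline: just the data.
struct TwoInts { int a; int b; };

// Same data plus several member functions.
class Busy {
    int a = 1;
    int b = 2;
public:
    int getA() const { return a; }
    int getB() const { return b; }
    int sum() const { return a + b; }
    int product() const { return a * b; }
    static int answer() { return 42; }   // static: no hidden `this` at all
};

// Same data plus virtual functions.
class WithVirtual {
    int a = 1;
    int b = 2;
public:
    virtual ~WithVirtual() = default;
    virtual int sum() const { return a + b; }
};

static_assert(sizeof(Busy) == sizeof(TwoInts), "member functions add no size");
static_assert(sizeof(WithVirtual) > sizeof(TwoInts), "a vtable pointer does add size");

int demoSizes() {
    int (*plain)() = &Busy::answer;   // an ordinary function pointer, no object needed
    Busy b;
    assert(b.sum() == 3);
    return plain();
}
```

Adding another hundred non-virtual functions to `Busy` would leave the first `static_assert` passing. Only data members, padding, and the virtual-table pointer contribute to the size of an object.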