Casting

Why cast?

Casting is a means of converting one object type to another. Objects, classes, and structs only have meaning to us as readers and writers of code; they don’t mean much to a computer. Everything is just data laid out in memory. An integer object is just a place in memory that houses some integer value. If we want, we can pretend it’s a float or a double, though that doesn’t necessarily mean the result is right or that the change in meaning is meaningful. Casting is quite useful, and it especially shines when converting between different levels of an inheritance hierarchy. Let’s look at some simple examples.

Let’s say there exists a function that takes an integer (specifically an int64_t):

#include <iostream>
#include <cstdint>

void print_integer_value(const int64_t in)
{
    std::cout << "Provided input value: " << in << std::endl;
}

If we happen to have a double at our disposal, let’s say 12345.678, and we only care about the integer component of it, we should be able to reasonably use this function. But if we go and try to use it with a double, we’ll run into some trouble (har har har).

print_integer_value(12345.678);

will cause the compiler to warn us with:

input_line_11:2:22: warning: implicit conversion from 'double' to 'int64_t' (aka 'long long') changes value from 12345.678 to 12345 [-Wliteral-conversion]

and for good reason! Normally, we probably don’t want to do this, and the compiler should warn us that we might be doing it accidentally. But if we wanted to do it anyway, how do we shush the compiler? That’s where casting comes into play:

print_integer_value(static_cast<int64_t>(12345.678));
print_integer_value(static_cast<int64_t>(-12345.678));
Provided input value: 12345
Provided input value: -12345
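
Notice that the cast simply drops the fractional part (it truncates toward zero), which is why -12345.678 became -12345 and not -12346. If rounding is what you actually want, a minimal sketch could look like the one below. The helper names truncate_to_int and round_to_int are made up for illustration, and both assume the value fits in an int64_t.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper: drops the fractional part (truncates toward zero).
// Assumes the value fits in an int64_t; otherwise the behavior is undefined.
inline int64_t truncate_to_int(double value)
{
    return static_cast<int64_t>(value);
}

// Hypothetical helper: rounds to the nearest integer instead.
// std::llround rounds halfway cases away from zero.
inline int64_t round_to_int(double value)
{
    return static_cast<int64_t>(std::llround(value));
}
```

With these, truncate_to_int(-12345.678) gives -12345 while round_to_int(-12345.678) gives -12346.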

While it works here, casting is not always safe, especially between integer types and especially when dealing with extreme values. For example, we can shoot ourselves in the foot by doing:

#include <limits>

const uint32_t my_large_num = std::numeric_limits<uint32_t>::max();
std::cout << "My large number: " << my_large_num << std::endl;

const int16_t my_smaller_num = static_cast<int16_t>(my_large_num);
std::cout << "Uh oh, what have we done? We got: " << my_smaller_num << std::endl;
My large number: 4294967295
Uh oh, what have we done? We got: -1

Had we not shushed the compiler by using a casting operator, we would have been warned that this might be a bad idea.

const int16_t my_smaller_num = my_large_num;

will result in compiler warnings of:

input_line_15:2:35: warning: implicit conversion from 'const uint32_t' (aka 'const unsigned int') to 'const int16_t' (aka 'const short') changes value from 4294967295 to -1 [-Wconstant-conversion]
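
If the value is only known at runtime, the compiler can’t warn about the specific number, so it can be worth checking the range yourself before casting. Here is a rough sketch of that idea, assuming C++17 for std::optional. The name checked_narrow is hypothetical and not something from the standard library.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical helper: converts a uint32_t to an int16_t only if the value fits.
// Returns std::nullopt instead of silently producing a different number.
inline std::optional<int16_t> checked_narrow(uint32_t value)
{
    // The max of int16_t (32767) is non-negative, so widening it to uint32_t is lossless.
    constexpr auto max_allowed = static_cast<uint32_t>(std::numeric_limits<int16_t>::max());
    if (value > max_allowed)
    {
        return std::nullopt;
    }
    return static_cast<int16_t>(value);
}
```

The caller is then forced to decide what to do when the value doesn’t fit, rather than discovering a surprising -1 later on.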

Note

For some cases, the compiler will cast a type for you without you explicitly saying so. For example:

uint8_t my_small_val = 0;
uint16_t my_medium_val = my_small_val; // totally cool - the compiler won’t say a peep

Maybe casting integers seems a bit silly and you say you’d never do such a thing. Sure, let’s assume this is true. There are still other good reasons to cast. My favorite is casting up and down an inheritance chain.

Let’s show you what we mean:

// Let's say we have a simple base class like so:
struct User
{
    User() = default;
    User(uint64_t i, uint64_t a, uint64_t n) : id(i), age(a), number_of_signins(n) {}
    virtual ~User() = default;
    
    uint64_t id;
    uint64_t age;
    uint64_t number_of_signins;
};

// Let's go ahead and create a struct that inherits from our base.
struct PayingUser : User
{
    uint64_t amount_paid;
};

Let’s also say we have a function that takes a User and prints its values:

void print_user_data(const User& user)
{
    std::cout << "Userid: " << user.id << ", Age: " << user.age << ", Signins: " << user.number_of_signins << std::endl;
}

Let’s go ahead and create an instance of our PayingUser:

PayingUser user;
user.id = 0;
user.age = 25;
user.number_of_signins = 0;
user.amount_paid = 10;

So, cool - we have an instance of our paying user. But how do we print the user contents with our print_user_data function? Maybe we can use this casting thing we just heard about:

print_user_data(static_cast<User&>(user));
Userid: 0, Age: 25, Signins: 0

And that seems to work just fine! While this works, the explicit cast isn’t actually needed here. We mentioned earlier how the compiler will cast a type for you when it is safe to do so, as in the uint8_t -> uint16_t case. Here, it’s also safe to implicitly convert a PayingUser to a User:

print_user_data(user);
Userid: 0, Age: 25, Signins: 0
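
If you’d like the compiler to confirm which direction is implicit, type traits can spell it out. The sketch below restates trimmed-down versions of User and PayingUser so it stands alone, and assumes C++17 for std::is_convertible_v. The function signins_of is a made-up helper.

```cpp
#include <cstdint>
#include <type_traits>

// Restated (and trimmed) from the text so the snippet stands alone.
struct User
{
    virtual ~User() = default;
    uint64_t id = 0;
    uint64_t age = 0;
    uint64_t number_of_signins = 0;
};

struct PayingUser : User
{
    uint64_t amount_paid = 0;
};

// Derived -> base happens implicitly for references and pointers...
static_assert(std::is_convertible_v<PayingUser&, User&>);
static_assert(std::is_convertible_v<PayingUser*, User*>);

// ...but base -> derived does not; that direction needs an explicit cast.
static_assert(!std::is_convertible_v<User&, PayingUser&>);
static_assert(!std::is_convertible_v<User*, PayingUser*>);

// Hypothetical helper: accepts any kind of user through a reference to the base type.
inline uint64_t signins_of(const User& user)
{
    return user.number_of_signins;
}
```

Passing a PayingUser straight to signins_of compiles without any cast, exactly like the print_user_data call above.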

Before we go any further, let’s talk about the why. Why is this safe for the compiler to just do on your behalf? Well, let’s step back and talk about the uint8_t to uint16_t case. The ranges of uint8_t and uint16_t are:

std::cout << "Range of uint8_t - [" << static_cast<uint32_t>(std::numeric_limits<uint8_t>::min()) << ", " << static_cast<uint32_t>(std::numeric_limits<uint8_t>::max()) << "]" << std::endl;
std::cout << "Range of uint16_t - [" << static_cast<uint32_t>(std::numeric_limits<uint16_t>::min()) << ", " << static_cast<uint32_t>(std::numeric_limits<uint16_t>::max()) << "]" << std::endl;
Range of uint8_t - [0, 255]
Range of uint16_t - [0, 65535]

Note

Feel free to ignore the cast to uint32_t. A uint8_t doesn’t print properly when using the streams: instead of rendering the number, the stream tries to render it as a char, which wouldn’t make any sense for our purposes.
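
To see the difference for yourself, here is a small sketch that captures both renderings in strings. It assumes a typical platform where uint8_t is an alias for unsigned char and the character set is ASCII. The helper names are invented for this example.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper: renders a uint8_t as a number instead of a character.
inline std::string as_number(uint8_t value)
{
    std::ostringstream out;
    // Unary plus promotes the uint8_t to int, so the stream prints digits.
    out << +value;
    return out.str();
}

// For contrast: streaming the uint8_t directly treats it as a character.
inline std::string as_character(uint8_t value)
{
    std::ostringstream out;
    out << value;
    return out.str();
}
```

On such a platform, as_number(65) produces "65" while as_character(65) produces "A". The unary plus is just a shorter way of getting the same effect as the static_cast used above.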

Are there any values of uint8_t that cannot be represented by uint16_t? It doesn’t seem like it. The range of uint16_t is a superset of the range of uint8_t. So if we take any uint8_t value and tell the computer to treat it as a uint16_t, everything is fine because every value is still representable. The other way around, however, is not safe. Any value greater than 255 does not fit in a uint8_t, so the conversion would silently change it.

What about the PayingUser and User case? Well, let’s take a look at the memory layout of our structures. We mentioned earlier that everything is really just data laid out in memory. So when we say we have an object of type User, we actually have something that points to some data. Something like this:

                     Memory address
                    |     0x0      |
                    |     0x8      | (each block is 8 bytes)
                    |     ...      |
                          ...
                          
user (0x1234) ----> |     id       | 0x1234
                    |     age      | 0x123C
                    |  num signins | 0x1244
                    |  amount paid | 0x124C
                    
                          ...
                          ...

The name user simply refers to the memory location where the contents of the PayingUser object are stored. If we ever want to modify the id of the user object, we are really just going to that memory location and changing whatever number is in there to our new number.

                     Memory address
                    |     0x0      |
                    |     0x8      | (each block is 8 bytes)
                    |     ...      |
                          ...
                          
                    |     id       | 0x1234 --- --------------------------------------
                    |     age      | 0x123C     > Needed to store the User object     \
                    |  num signins | 0x1244 ---                                        > Is what we need to store the Paying User object
                    |  amount paid | 0x124C ------------------------------------------
                    
                          ...
                          ...

What’s important to note is that an object’s address points to the start of its data. The first three 8-byte sections of a PayingUser are really just the User part. This means we can tell the compiler to forget about the rest and instead treat the memory starting at address 0x1234 as an object of type User. At the end of the day, it’s all just data somewhere. It just so happens that in this case, someone can forget that the amount paid bit exists and no harm is done. (One simplification: because User has a virtual destructor, most compilers also store a hidden vtable pointer inside each object, which the diagrams leave out.)

To show this in code, here are the memory addresses of the members of the user object:

std::cout << "Addresses of the paying user instance: " << std::endl;
std::cout << "\t" << static_cast<void*>(&user.id) << std::endl;
std::cout << "\t" << static_cast<void*>(&user.age) << std::endl;
std::cout << "\t" << static_cast<void*>(&user.number_of_signins) << std::endl;
std::cout << "\t" << static_cast<void*>(&user.amount_paid) << std::endl;

auto& user_base_attributes = static_cast<User&>(user);
std::cout << "Addresses for the user base instance: " << std::endl;
std::cout << "\t" << static_cast<void*>(&user_base_attributes.id) << std::endl;
std::cout << "\t" << static_cast<void*>(&user_base_attributes.age) << std::endl;
std::cout << "\t" << static_cast<void*>(&user_base_attributes.number_of_signins) << std::endl;
// If we uncomment the next line, the compiler will complain that it doesn't know what this member named `amount_paid` is
// This is because we are telling the compiler that this is an object of type User. User doesn't have an amount_paid member
// so the compiler can't possibly let us access that thing!
// std::cout << "\t" << static_cast<void*>(&user_base_attributes.amount_paid) << std::endl;
Addresses of the paying user instance: 
	0x7fc7f0aec030
	0x7fc7f0aec038
	0x7fc7f0aec040
	0x7fc7f0aec048
Addresses for the user base instance: 
	0x7fc7f0aec030
	0x7fc7f0aec038
	0x7fc7f0aec040

So going from PayingUser to User doesn’t lead to any invalid information, incorrect access, or overreach of context. We are only decreasing in scope and access here. This is why the compiler can just do it for you. It’s safe.

But what if we wanted to go the other way around? What if we had an instance of User and we wanted to get an instance of PayingUser? Well, this is not so safe anymore. For example, let’s say someone created an instance of the User type (unlike earlier where we created an instance of PayingUser):

User nonpaying_user{1, 30, 2};

If we looked at the memory of this object, it would look something like:

                     Memory address
                    |     0x0      |
                    |     0x8      | (each block is 8 bytes)
                    |     ...      |
                          ...
                          
user (0x1000) ----> |     1        | id -          0x1000 -------
                    |     30       | age -         0x1008         > this is the only section that we care about!
                    |     2        | num signins - 0x1010 -------
                    |  garbage!!!! | amount paid - 0x1018 (Anything could be at this memory location!)
                    
                          ...
                          ...

At memory location 0x1018 there could literally be anything in there. Maybe old data that hasn’t been reset yet. Maybe it belongs to a different object entirely. Who knows!

If we wanted to get an instance of PayingUser from this User object, it’s not so safe anymore. If we blindly cast, we could be reading garbage when we try to read amount_paid! Let’s show you what we mean:

// Let's unsafely cast the non-paying instance to a paying one
auto& totally_not_safe_paying_user = static_cast<PayingUser&>(nonpaying_user);

std::cout << "Id: " << totally_not_safe_paying_user.id << std::endl;
std::cout << "Accessing amount paid: " << totally_not_safe_paying_user.amount_paid << std::endl;
Id: 1
Accessing amount paid: 140496713203800

Sometimes the value above is 0 and other times you’ll see random numbers. This is undefined behavior and something to watch out for. The cast from a User to a PayingUser is unsafe. Unless we know for sure that it used to be a PayingUser that was converted down to a User, we are going to be running into trouble.

In general, it’s not always clear when it is safe to go from a User object to a PayingUser object. One case where we want to do something like this, while knowing for sure that it is safe, is the Curiously Recurring Template Pattern (CRTP).

We would recommend checking out CRTP for more info, but we’ll go over a quick example here:

template <typename Derived> // We take in the Derived type here so that the base has the context it needs to cast to it
struct BaseType
{
    void some_base_func()
    {
        std::cout << "Someone called some_base_func!" << std::endl;
        
        static_cast<Derived*>(this)->some_derived_func(value); // We can now cast down to the derived type since we know what it is!
    }
    
    uint32_t value;
};

struct DerivedType : BaseType<DerivedType>
{
    using BaseType<DerivedType>::some_base_func;
    
    DerivedType(uint32_t value) : BaseType<DerivedType>{value} {}

    void some_derived_func(uint32_t value)
    {
        std::cout << "Derived method: some_derived_func was called with value: " << value << std::endl;
    }
};

auto derived = DerivedType(123);
// let's refer to the same object through its base type instead (a reference, so we don't copy just the base part)
BaseType<DerivedType>& base = derived;
// now let's call the base method
base.some_base_func();
Someone called some_base_func!
Derived method: some_derived_func was called with value: 123

Even though we were in the base, we were able to (safely) cast down to the derived type and call a method with escalated privileges!
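
To make the pattern a little more concrete, here is another small CRTP sketch with two derived types sharing one base implementation. All of the names (Greeter, English, French, name_impl, greet_via_base) are invented for this example. The cast is only safe on the assumption that every Derived really does inherit from Greeter<Derived> and that we never work with a standalone copy of just the base part.

```cpp
#include <string>

// Hypothetical CRTP base: every derived type supplies its own name_impl().
template <typename Derived>
struct Greeter
{
    std::string greet() const
    {
        return "Hello from " + derived().name_impl();
    }

private:
    // Safe as long as Derived really inherits from Greeter<Derived>.
    const Derived& derived() const
    {
        return static_cast<const Derived&>(*this);
    }
};

struct English : Greeter<English>
{
    std::string name_impl() const { return "English"; }
};

struct French : Greeter<French>
{
    std::string name_impl() const { return "French"; }
};

// Takes the base by reference, so the object is never copied or sliced.
template <typename T>
std::string greet_via_base(const Greeter<T>& base)
{
    return base.greet();
}
```

Calling greet_via_base(French{}) returns "Hello from French" without any virtual functions involved; the derived type is known at compile time.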

Casting in other ways

So far, we’ve only talked about static_cast and implicit casting that the compiler does for you. C++ provides us with other ways of casting as well!

static_cast

We started with static_cast because it is the most restrictive casting operation. It only lets you cast between types that are related in some sense. This means we can go from a uint8_t to a uint16_t (and vice versa) or from a PayingUser to a User (and vice versa). We cannot, however, convert between an integer and a string: they are not directly related, and there is no single obvious way to do the conversion. Similarly, we can’t just go from a uint8_t to a User. They have nothing to do with each other.

The only exception to this relationship rule is casting to and from void*. Any object pointer is fair game when it comes to void*. The reason is that void* strips away all context. It’s just a pure pointer. An instance of an object really lives at some memory address, which is why we can get a pointer to that memory location. Casting that pointer to a void* just says, “hey, forget about everything you know about this memory location when it comes to types. Just remember the memory address and that’s it.”
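
A common place to run into this is a C-style callback that carries its context around as a void*. The following is only a sketch: Callback, run_callback, Counter, and count_hit are all hypothetical names standing in for whatever API you are actually dealing with.

```cpp
#include <cstdint>

// Hypothetical C-style callback signature: the context travels as a void*.
using Callback = void (*)(void* context);

// Hypothetical stand-in for a library that only knows about void*.
inline void run_callback(Callback callback, void* context)
{
    callback(context);
}

struct Counter
{
    uint32_t hits = 0;
};

inline void count_hit(void* context)
{
    // We know the context started life as a Counter*, so casting back is safe.
    auto* counter = static_cast<Counter*>(context);
    ++counter->hits;
}
```

The round trip is only valid when you cast back to the same type the pointer originally had. Nothing checks this for you, so the "we know what it was" part is entirely on the programmer.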

Note

You may sometimes see folks doing (void*)(&obj). This is the old C-style casting. We recommend moving towards the explicit cast operators that C++ provides for readability and safety.

This is the most restrictive and should generally be preferred first. If you want to read more, check out: static_cast.

reinterpret_cast

We keep coming back to this idea that it’s all just bytes hanging out somewhere. reinterpret_cast takes that literally. It is a way of telling the compiler, “hey, trust me - just take this blob of data and pretend it’s this other blob of data and I promise it’s all good!”. With great power comes great responsibility, and it is way too easy for someone to mess up when doing this. Things stop making as much sense if you start casting random objects to other random object types!

Let’s do a quick example:

struct FourSeperateBytes
{
    uint8_t first;
    uint8_t second;
    uint8_t third;
    uint8_t fourth;
};

// Let's say someone gives us a uint64_t number like:
uint64_t my_value = 67305985;
// in binary, the low 32 bits of this are:
// 00000100 00000011 00000010 00000001
//     4        3        2        1

// If we wanted to split out the number into the first, second, third, and fourth byte regions, we can just do:
auto seperated = reinterpret_cast<FourSeperateBytes&>(my_value);

std::cout << "First: " << static_cast<uint32_t>(seperated.first) << std::endl;
std::cout << "Second: " << static_cast<uint32_t>(seperated.second) << std::endl;
std::cout << "Third: " << static_cast<uint32_t>(seperated.third) << std::endl;
std::cout << "Fourth: " << static_cast<uint32_t>(seperated.fourth) << std::endl;
First: 1
Second: 2
Third: 3
Fourth: 4

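By using reinterpret_cast, we were able to tell the compiler to just pretend that the uint64_t we had is really an object of type FourSeperateBytes. This lets us reach into the byte offsets and use them directly. Isn’t that neat? Keep in mind that the result shown here depends on the machine’s byte order: on a little-endian machine the least significant byte comes first in memory, which is why first is 1.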
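
If all you need is the individual bytes, you can also get them without reinterpreting any memory, which sidesteps questions about byte order. Below is a minimal sketch using shifts and masks. The helper name byte_at is made up, and it assumes the index is between 0 and 7.

```cpp
#include <cstdint>

// Hypothetical helper: returns byte `index` of `value`, where index 0 is the
// least significant byte. Assumes index is in the range [0, 7].
constexpr uint8_t byte_at(uint64_t value, unsigned index)
{
    return static_cast<uint8_t>((value >> (8 * index)) & 0xFF);
}
```

For 67305985, byte_at returns 1, 2, 3, and 4 for indexes 0 through 3 on any machine, because it works on the value rather than on how that value is stored.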

It’s important to keep in mind that with reinterpret_cast, all bets are off. The compiler doesn’t warn you when you are about to do something wrong.

// Unlike in the previous example, if we did that with something that wasn't as appropriate:
uint8_t my_small_value = 5;

auto small_seperated = reinterpret_cast<FourSeperateBytes&>(my_small_value);

std::cout << "First: " << static_cast<uint32_t>(small_seperated.first) << std::endl;
std::cout << "Second: " << static_cast<uint32_t>(small_seperated.second) << std::endl;
std::cout << "Third: " << static_cast<uint32_t>(small_seperated.third) << std::endl;
std::cout << "Fourth: " << static_cast<uint32_t>(small_seperated.fourth) << std::endl;
First: 5
Second: 0
Third: 0
Fourth: 0

This lets us read into an area of memory we aren’t supposed to touch. Again, these values are going to be garbage. Sometimes they end up being 0, but not always. This is undefined behavior, so watch out for these things!
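
One way to keep this kind of mistake from compiling at all is to copy the bytes through a helper that insists the sizes match. The sketch below assumes C++17. The names copy_bits and FourBytes are invented for illustration (C++20 offers std::bit_cast in <bit> for the same job).

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical helper: copies the bytes of `from` into a new To object.
// The static_asserts reject mismatched sizes at compile time instead of
// letting us read past the end of the source object.
template <typename To, typename From>
To copy_bits(const From& from)
{
    static_assert(sizeof(To) == sizeof(From), "sizes must match exactly");
    static_assert(std::is_trivially_copyable_v<To> && std::is_trivially_copyable_v<From>,
                  "both types must be trivially copyable");
    To to;
    std::memcpy(&to, &from, sizeof(To));
    return to;
}

// Hypothetical 4-byte struct, similar in spirit to the one used above.
struct FourBytes
{
    uint8_t first;
    uint8_t second;
    uint8_t third;
    uint8_t fourth;
};
```

With this, copy_bits<FourBytes>(some_uint32) compiles, while trying the same thing with a uint8_t is rejected by the first static_assert. Which byte lands in which member still depends on the machine’s byte order.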

Just like how there is a time and place for everything, this too has its purpose. If you want to read more, check out: reinterpret_cast.

const_cast

Just like reinterpret_cast, const_cast gives you a lot of power that you ought to think twice about using. It lets us cast away the constness of something.

In C++, we have the const qualifier that lets us tell the compiler, “please make sure no one modifies this!”. const_cast lets you tell the compiler it’s okay when you decide to change it anyway!

uint64_t my_unmodifiable_variable = 24;
const uint64_t& ref_to_unmodifiable = my_unmodifiable_variable;
// If we uncomment the next line, the compiler (rightfully so) will throw errors saying:
//     error: cannot assign to variable 'ref_to_unmodifiable' with const-qualified type 'const uint64_t &' (aka 'const unsigned long long &')
// ref_to_unmodifiable = 10;

// But what if we want to do it anyway? const_cast to the rescue! (... or to cause mayhem)
std::cout << "Before: " << my_unmodifiable_variable << std::endl;

const_cast<uint64_t&>(ref_to_unmodifiable) = 12345;

std::cout << "After: " << my_unmodifiable_variable << std::endl;
Before: 24
After: 12345
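
It’s worth pointing out that this only works because my_unmodifiable_variable was not itself declared const. Writing through a const_cast to an object that really was declared const is undefined behavior. One commonly cited use that stays on the safe side is implementing a non-const accessor in terms of its const twin. Here is a sketch of that idea, with a made-up Scores class:

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical container used only for illustration.
class Scores
{
public:
    explicit Scores(std::vector<uint32_t> values) : values_(std::move(values)) {}

    // The "real" implementation lives in the const overload.
    const uint32_t& at(std::size_t index) const
    {
        return values_.at(index);
    }

    // The non-const overload reuses it. Casting the const away is fine here
    // because *this is known to be non-const inside this overload.
    uint32_t& at(std::size_t index)
    {
        const Scores& self = *this;
        return const_cast<uint32_t&>(self.at(index));
    }

private:
    std::vector<uint32_t> values_;
};
```

The lookup logic lives in one place, and the const_cast never touches an object that was const to begin with.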

If you want to read more, check out: const_cast

dynamic_cast

“Safely converts pointers and references to classes up, down, and sideways along the inheritance hierarchy.” - CppReference

Earlier, we mentioned how static_casting a base User type to a PayingUser type (when it was never originally a PayingUser) can be a problem. static_cast can’t protect us from this. But what if we wanted to check whether the cast is valid, and only then do it? That’s one place where dynamic_cast comes in.

if (PayingUser* paying_user = dynamic_cast<PayingUser*>(&nonpaying_user))
{
    std::cout << "This was a paying user after all - safe to do so!" << std::endl;
}
else
{
    std::cout << "Nope, never was a paying user type!" << std::endl;
}
Nope, never was a paying user type!
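
For completeness, here is a sketch of the case where the cast does succeed, along with the reference form of dynamic_cast, which throws std::bad_cast on failure because there is no such thing as a null reference. The types are trimmed-down restatements of the ones above, the helper names are hypothetical, and std::optional assumes C++17.

```cpp
#include <cstdint>
#include <optional>
#include <typeinfo>

// Trimmed-down restatements so the snippet stands alone.
struct User
{
    virtual ~User() = default;
    uint64_t id = 0;
};

struct PayingUser : User
{
    uint64_t amount_paid = 0;
};

// Pointer form: yields the amount paid only if `user` is really a PayingUser.
inline std::optional<uint64_t> amount_paid_or_nothing(const User& user)
{
    if (const auto* paying = dynamic_cast<const PayingUser*>(&user))
    {
        return paying->amount_paid;
    }
    return std::nullopt;
}

// Reference form: failure is reported by throwing std::bad_cast.
inline bool reference_cast_succeeds(const User& user)
{
    try
    {
        static_cast<void>(dynamic_cast<const PayingUser&>(user));
        return true;
    }
    catch (const std::bad_cast&)
    {
        return false;
    }
}
```

Passing a real PayingUser through a User reference gives back its amount_paid, while a plain User yields std::nullopt in the pointer form and an exception in the reference form.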

You might be wondering two things at this point.

  1. How in the world does dynamic_cast know if something was of a certain type or not?

  2. Why doesn’t static_cast also do this?

The reason static_cast doesn’t also do this is that dynamic_cast incurs a runtime cost to perform the check. static_cast is resolved at compile time, so it has no chance to check this. But what is this runtime check doing, and where is the data for it? In the vtable!

When we created the User class, we gave it a virtual destructor. This sets up a vtable that hangs on to this information. We don’t expect that to mean much at this point, which is why we have a full section dedicated to Virtuals if you want to dive deeper. Otherwise, for now, just know that some type information is stashed away in this vtable (it’s also how C++ knows to properly destroy an object all the way to the derived type even if you only have a pointer to the base type).

The important part here is that dynamic_cast while safer has a cost. A runtime cost. If performance is really a concern, it’s generally best to avoid using this.
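
When you control the class hierarchy, one possible way to avoid the cast altogether is to let a virtual function answer the question instead. This is only a sketch of that alternative, not a drop-in replacement: total_paid and revenue_from are hypothetical names, and a virtual call is still a runtime dispatch, though usually a cheaper one than a dynamic_cast.

```cpp
#include <cstdint>

struct User
{
    virtual ~User() = default;

    // Hypothetical virtual hook: a plain user has paid nothing.
    virtual uint64_t total_paid() const { return 0; }
};

struct PayingUser : User
{
    uint64_t amount_paid = 0;

    uint64_t total_paid() const override { return amount_paid; }
};

// No cast needed: the virtual call dispatches to the right implementation.
inline uint64_t revenue_from(const User& user)
{
    return user.total_paid();
}
```

The caller never needs to know whether it is holding a User or a PayingUser, so there is nothing left to cast.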

If you want to read more on this topic, check out:

Key takeaways

  • Everything is really just some data hanging out in some location on a computer

  • C++ offers ways of manipulating the meaning of this “data” with static_cast, reinterpret_cast, const_cast, and dynamic_cast

  • Always prefer the most restrictive cast first (aka, static_cast) and escalate as necessary.

  • Anything goes when it comes to reinterpret_cast so be careful!

  • Using const_cast can be really confusing to a reader because it can violate an earlier expectation of constness

  • dynamic_cast has a runtime cost!

References:

If you want to continue learning more on this subject or are curious about other resources that might help in the learning process: