Casting¶
Why cast?¶
Casting is a means of converting an object of one type to another. Objects, classes, and structs only really have meaning to us as readers and writers of code; they don’t mean much to a computer. Everything is just data laid out in memory. An integer object is just a place in memory that houses some integer value. If we want, we can pretend it’s a float or a double, though that doesn’t necessarily mean the result is right or that the change in meaning is meaningful. Casting is quite useful and especially shines when casting between different levels of an inheritance hierarchy. Let’s look at some simple examples.
Let’s say there exists a function that takes an integer (specifically an int64_t):
#include <iostream>
#include <cstdint>
void print_integer_value(const int64_t in)
{
    std::cout << "Provided input value: " << in << std::endl;
}
If we happen to have a double at our disposal, say 12345.678, and we only care about its integer component, we should reasonably be able to use this function. But if we try to call it with a double, we’ll run into some trouble (har har har).
print_integer_value(12345.678);
will cause the compiler to warn us with:
input_line_11:2:22: warning: implicit conversion from 'double' to 'int64_t' (aka 'long long') changes value from 12345.678 to 12345 [-Wliteral-conversion]
and for good reason! Normally we don’t want to do this, and the compiler should warn us that we might be doing it accidentally. But if we want to do it anyway, how do we shush the compiler? That’s where casting comes into play:
print_integer_value(static_cast<int64_t>(12345.678));
print_integer_value(static_cast<int64_t>(-12345.678));
Provided input value: 12345
Provided input value: -12345
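Notice that the cast drops the fractional part rather than rounding. As a minimal sketch of the difference (the two helper names below are made up for illustration and are not part of the original example), compare a plain static_cast with std::llround from <cmath>:

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper: static_cast from double to an integer type
// discards the fractional part, i.e. it truncates toward zero.
std::int64_t truncate_toward_zero(double value)
{
    return static_cast<std::int64_t>(value);
}

// Hypothetical helper: std::llround rounds to the nearest integer,
// with halfway cases rounded away from zero.
std::int64_t round_to_nearest(double value)
{
    return static_cast<std::int64_t>(std::llround(value));
}
```

With 12345.678 as input, the first helper gives 12345 while the second gives 12346, so pick whichever matches what "the integer component" means for your use case.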
While we did it here, casting is not always safe, especially with integers and especially when you are dealing with extreme values. For example, we can shoot ourselves in the foot by doing:
#include <limits>
const uint32_t my_large_num = std::numeric_limits<uint32_t>::max();
std::cout << "My large number: " << my_large_num << std::endl;
const int16_t my_smaller_num = static_cast<int16_t>(my_large_num);
std::cout << "Uh oh, what have we done? We got: " << my_smaller_num << std::endl;
My large number: 4294967295
Uh oh, what have we done? We got: -1
Had we not shushed the compiler by using a casting operator, we would have been warned that this might be a bad idea. Writing the conversion implicitly:
const int16_t my_smaller_num = my_large_num;
will result in a compiler warning of:
input_line_15:2:35: warning: implicit conversion from 'const uint32_t' (aka 'const unsigned int') to 'const int16_t' (aka 'const short') changes value from 4294967295 to -1 [-Wconstant-conversion]
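If a narrowing conversion is genuinely needed, one option is to check at runtime whether the value survives the trip. The helper below is a hypothetical sketch (checked_narrow is not a standard function, and it assumes C++17 for std::optional and if constexpr); it returns an empty optional whenever the cast would change the value:

```cpp
#include <cstdint>
#include <optional>
#include <type_traits>

// Hypothetical helper: narrow an integer and report failure instead of
// silently changing the value.
template <typename To, typename From>
std::optional<To> checked_narrow(From value)
{
    static_assert(std::is_integral_v<To> && std::is_integral_v<From>,
                  "checked_narrow only handles integer types");

    const To converted = static_cast<To>(value);

    // If converting back does not give the original, bits were lost.
    if (static_cast<From>(converted) != value)
    {
        return std::nullopt;
    }

    // A round trip can still succeed when only the sign changed
    // (e.g. 4294967295 -> -1 -> 4294967295), so compare signs as well.
    if constexpr (std::is_signed_v<To> != std::is_signed_v<From>)
    {
        if ((converted < To{}) != (value < From{}))
        {
            return std::nullopt;
        }
    }

    return converted;
}
```

Calling checked_narrow<int16_t>(my_large_num) with the value from above would yield an empty optional, whereas a small value such as 1234 comes back unchanged.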
Note
For some cases, the compiler will cast a type for you without you explicitly saying so. For example:
uint8_t my_small_val = 0;
uint16_t my_medium_val = my_small_val; // totally cool - the compiler won’t say a peep
Maybe casting integers seems a bit silly and you say you’d never do such a thing. Sure, let’s assume that is true. There are still other good reasons to cast. My favorite is casting up and down an inheritance chain.
Let’s show you what we mean:
// Let's say we have a simple base class like so:
struct User
{
    User() = default;
    User(uint64_t i, uint64_t a, uint64_t n) : id(i), age(a), number_of_signins(n) {}
    virtual ~User() = default;
    uint64_t id;
    uint64_t age;
    uint64_t number_of_signins;
};
// Let's go ahead and create a struct that inherits from our base.
struct PayingUser : User
{
    uint64_t amount_paid;
};
Let’s also say we have a function that takes a User and prints its values:
void print_user_data(const User& user)
{
    std::cout << "Userid: " << user.id << ", Age: " << user.age << ", Signins: " << user.number_of_signins << std::endl;
}
Let’s go ahead and create an instance of our PayingUser:
PayingUser user;
user.id = 0;
user.age = 25;
user.number_of_signins = 0;
user.amount_paid = 10;
So, cool - we have an instance of our paying user. But how do we print the user contents with our print_user_data function? Maybe we can use this casting thing we just heard about:
print_user_data(static_cast<User&>(user));
Userid: 0, Age: 25, Signins: 0
And that seems to work just fine! While this works, the explicit cast isn’t actually needed here. We mentioned earlier how the compiler will cast a type for you (when it is safe to do so), like in the uint8_t -> uint16_t case. Here, it’s also safe to implicitly cast PayingUser to User:
print_user_data(user);
Userid: 0, Age: 25, Signins: 0
Before we go any further, let’s talk about the why. Why is this safe for the compiler to just do on your behalf? Well, let’s step back and talk about the uint8_t to uint16_t case.
The ranges of uint8_t and uint16_t are:
std::cout << "Range of uint8_t - [" << static_cast<uint32_t>(std::numeric_limits<uint8_t>::min()) << ", " << static_cast<uint32_t>(std::numeric_limits<uint8_t>::max()) << "]" << std::endl;
std::cout << "Range of uint16_t - [" << static_cast<uint32_t>(std::numeric_limits<uint16_t>::min()) << ", " << static_cast<uint32_t>(std::numeric_limits<uint16_t>::max()) << "]" << std::endl;
Range of uint8_t - [0, 255]
Range of uint16_t - [0, 65535]
Note
Feel free to ignore the cast to uint32_t. uint8_t doesn’t print properly when using streams (it is rendered as a char rather than as a number, which wouldn’t make any sense for our purposes).
Are there any values of uint8_t that cannot be represented by uint16_t? No: the range of uint16_t is a superset of the range of uint8_t. So if we take any uint8_t value and tell the computer to treat it as a uint16_t, everything is fine because every value is still representable. The other way around, however, is not safe. Any value greater than 255 cannot be represented in a uint8_t, so the conversion would produce a different number.
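The "superset" argument can also be checked at compile time. The following is a rough sketch, assuming C++17; converts_without_loss is a made-up name, and it only reasons about integer types by comparing signedness and the number of value bits reported by std::numeric_limits:

```cpp
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical helper: true when every value of From is representable in To.
template <typename From, typename To>
constexpr bool converts_without_loss()
{
    static_assert(std::is_integral_v<From> && std::is_integral_v<To>,
                  "integer types only");

    // A signed source can hold negative values, which no unsigned type can represent.
    if (std::is_signed_v<From> && !std::is_signed_v<To>)
    {
        return false;
    }

    // digits is the number of value bits (the sign bit is not counted).
    return std::numeric_limits<To>::digits >= std::numeric_limits<From>::digits;
}

// uint8_t -> uint16_t keeps every value; the reverse direction does not.
static_assert(converts_without_loss<std::uint8_t, std::uint16_t>());
static_assert(!converts_without_loss<std::uint16_t, std::uint8_t>());
```

This mirrors the reasoning above: widening is safe because nothing falls outside the target range, while narrowing can only be safe for particular values.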
What about the PayingUser and User case? Well, let’s take a look at the memory layout of our structures. We mentioned earlier that everything is really just data laid out in memory. So when we say we have an object of type User, we really have something that refers to some data in memory. Something like this:
Memory address
| 0x0 |
| 0x8 | (each block is 8 bytes)
| ... |
...
user (0x1234) ----> | id | 0x1234
| age | 0x123C
| num signins | 0x1244
| amount paid | 0x124C
...
...
All user really stands for is the memory location in which we are storing the contents of the PayingUser object. If we ever want to modify the id of the user object, we are really just going to that memory location and changing whatever number is in there to our new number.
Memory address
| 0x0 |
| 0x8 | (each block is 8 bytes)
| ... |
...
| id | 0x1234 --- --------------------------------------
| age | 0x123C > Needed to store the User object \
| num signins | 0x1244 --- > Is what we need to store the Paying User object
| amount paid | 0x124C ------------------------------------------
...
...
What’s important to note is that an object refers to the start of the memory for its type. The first three 8-byte sections of a PayingUser are really just the User part. This means we can tell the compiler to forget about the rest and instead treat the memory starting at address 0x1234 as an object of type User. At the end of the day, it’s all just data somewhere. It just so happens that in this case, someone can forget that the amount paid bit exists and no harm is done. (Because User has a virtual destructor, most compilers also store a hidden vtable pointer inside the object; the diagrams leave it out to keep things simple.)
To show this in code, here are the memory addresses of the members of the user object:
std::cout << "Addresses of the paying user instance: " << std::endl;
std::cout << "\t" << static_cast<void*>(&user.id) << std::endl;
std::cout << "\t" << static_cast<void*>(&user.age) << std::endl;
std::cout << "\t" << static_cast<void*>(&user.number_of_signins) << std::endl;
std::cout << "\t" << static_cast<void*>(&user.amount_paid) << std::endl;
auto& user_base_attributes = static_cast<User&>(user);
std::cout << "Addresses for the user base instance: " << std::endl;
std::cout << "\t" << static_cast<void*>(&user_base_attributes.id) << std::endl;
std::cout << "\t" << static_cast<void*>(&user_base_attributes.age) << std::endl;
std::cout << "\t" << static_cast<void*>(&user_base_attributes.number_of_signins) << std::endl;
// If we uncomment the next line, the compiler will complain that it doesn't know what this member named `amount_paid` is
// This is because we are telling the compiler that this is an object of type User. User doesn't have an amount_paid member
// so the compiler can't possibly let us access that thing!
// std::cout << "\t" << static_cast<void*>(&user_base_attributes.amount_paid) << std::endl;
Addresses of the paying user instance:
0x7fc7f0aec030
0x7fc7f0aec038
0x7fc7f0aec040
0x7fc7f0aec048
Addresses for the user base instance:
0x7fc7f0aec030
0x7fc7f0aec038
0x7fc7f0aec040
So going from PayingUser to User doesn’t lead to any invalid information, incorrect access, or overreach of context. We are only decreasing in scope and access here. This is why the compiler can just do it for you. It’s safe.
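One detail worth spelling out: converting to a base reference does not copy anything, while assigning to a base object by value does (the derived part is "sliced" off). The sketch below uses stand-in types (BaseRecord and ExtendedRecord are hypothetical names mirroring User and PayingUser) and compares member addresses to show the difference:

```cpp
#include <cstdint>

// Stand-in types for illustration only.
struct BaseRecord
{
    virtual ~BaseRecord() = default;
    std::uint64_t id = 0;
};

struct ExtendedRecord : BaseRecord
{
    std::uint64_t extra = 0;
};

// A base reference is just another view of the same object,
// so the member lives at the same address.
bool reference_shares_storage(ExtendedRecord& extended)
{
    BaseRecord& view = extended;
    return &view.id == &extended.id;
}

// Initializing a base object by value copies only the base part
// into a brand new object with its own storage.
bool copy_shares_storage(ExtendedRecord& extended)
{
    BaseRecord copy = extended;
    return &copy.id == &extended.id;
}
```

For any ExtendedRecord, the first function returns true and the second returns false, which is why functions like print_user_data take the base type by reference.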
But what if we wanted to go the other way around? What if we had an instance of User and we wanted to get an instance of PayingUser? Well, this is not so safe anymore. For example, let’s say someone created an instance of the User type (unlike earlier where we created an instance of PayingUser):
User nonpaying_user{1, 30, 2};
If we looked at the memory of this object, it would look something like:
Memory address
| 0x0 |
| 0x8 | (each block is 8 bytes)
| ... |
...
user (0x1000) ----> | 1 | id - 0x1000 -------
| 30 | age - 0x1008 > this is the only section that we care about!
| 2 | num signins - 0x1010 -------
| garbage!!!! | amount paid - 0x1018 (Anything could be at this memory location!)
...
...
At memory location 0x1018 there could literally be anything. Maybe old data that hasn’t been reset yet. Maybe it belongs to a different object entirely. Who knows!
If we wanted to get an instance of PayingUser from this User object, it’s not so safe anymore. If we blindly cast, we could be reading garbage when we try to read amount_paid! Let’s show you what we mean:
// Let's unsafely cast the non-paying instance to a paying one
auto& totally_not_safe_paying_user = static_cast<PayingUser&>(nonpaying_user);
std::cout << "Id: " << totally_not_safe_paying_user.id << std::endl;
std::cout << "Accessing amount paid: " << totally_not_safe_paying_user.amount_paid << std::endl;
Id: 1
Accessing amount paid: 140496713203800
Sometimes the value above is 0 and other times you’ll see random numbers. This is undefined behavior and something to watch out for. The cast from a User to a PayingUser is unsafe. Unless we know for sure that the object was originally a PayingUser that was converted to a User, we are going to run into trouble.
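For contrast, here is a small sketch of the case where the same kind of static_cast is fine: the reference really does refer to an object that was created as the derived type. The type and function names (Account, PremiumAccount, amount_paid_unchecked) are hypothetical stand-ins, and the precondition in the comment is entirely on the caller:

```cpp
#include <cstdint>

// Stand-in types for illustration only.
struct Account
{
    virtual ~Account() = default;
    std::uint64_t id = 0;
};

struct PremiumAccount : Account
{
    std::uint64_t amount_paid = 0;
};

// Precondition: `account` must refer to an object that was created as a
// PremiumAccount. static_cast performs no check, so violating this is
// undefined behavior.
std::uint64_t amount_paid_unchecked(Account& account)
{
    return static_cast<PremiumAccount&>(account).amount_paid;
}
```

If a PremiumAccount with amount_paid set to 10 is passed in through an Account reference, the function returns 10. Passing a plain Account would compile just the same, which is exactly the danger described above.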
In general, it’s not always clear when you can safely go from a User object to a PayingUser object. One case in which we want to do something like this while knowing for sure that it is safe is the Curiously Recurring Template Pattern (CRTP).
We would recommend checking out CRTP for more info, but we’ll go over a quick example here:
template <typename Derived> // We take in the Derived type here so that the base knows which type to cast to
struct BaseType
{
    void some_base_func()
    {
        std::cout << "Someone called some_base_func!" << std::endl;
        static_cast<Derived*>(this)->some_derived_func(value); // We can now cast to the derived type since we know what it is!
    }
    uint32_t value;
};
struct DerivedType : BaseType<DerivedType>
{
    using BaseType<DerivedType>::some_base_func;
    DerivedType(uint32_t value) : BaseType<DerivedType>{value} {}
    void some_derived_func(uint32_t value)
    {
        std::cout << "Derived method: some_derived_func was called with value: " << value << std::endl;
    }
};
auto derived = DerivedType(123);
// let's refer to the same object through a reference to the base instead
// (copying it into a separate BaseType object would slice off the derived part and make the cast invalid)
BaseType<DerivedType>& base = derived;
// now let's call the base method
base.some_base_func();
Someone called some_base_func!
Derived method: some_derived_func was called with value: 123
Even though we were in the base, we were able to (safely) cast to the derived type and call a method with escalated privileges!
Casting in other ways¶
So far, we’ve only talked about static_cast and implicit casting that the compiler does for you. C++ provides us with other ways of casting as well!
static_cast¶
We started with static_cast because it is the most restrictive casting operation. It only lets you cast between types that are related in some sense.
This means we can go from a uint8_t to a uint16_t (and vice versa) or from a PayingUser to a User (and vice versa). We cannot convert between an integer and a string, however. There isn’t an obvious, unambiguous way to do that conversion, and the types are not directly related. Similarly, we can’t just go from a uint8_t to a User. They have nothing to do with each other.
The only exception to this relationship rule is casting to and from void*. Any object pointer is fair game when it comes to void*. The reason for this is that void* strips away all context. It’s just a pure pointer. An instance of an object really lives at some memory address, which is why we can get a pointer to that memory location. Casting that pointer to a void* just says, “hey, forget about everything you know about this memory location when it comes to types. Just remember the memory address and that’s it.”
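A typical place this shows up is a C-style callback that receives its context as a void*. The sketch below is illustrative only (add_ten and run_callback_on are hypothetical names); the important part is that the pointer is cast back to exactly the type it had before the type information was stripped away:

```cpp
#include <cstdint>

// Hypothetical callback in the style of a C API: the context arrives as void*.
void add_ten(void* context)
{
    // We must cast back to the type the pointer originally had.
    auto* counter = static_cast<std::uint32_t*>(context);
    *counter += 10;
}

// Hypothetical driver: hands the address of a local counter to the callback.
std::uint32_t run_callback_on(std::uint32_t start)
{
    std::uint32_t counter = start;
    add_ten(static_cast<void*>(&counter)); // forget the type, keep the address
    return counter;
}
```

Calling run_callback_on(5) returns 15. Casting the void* back to some other, unrelated pointer type would compile too, but then we are back in "garbage" territory.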
Note
You may sometimes see folks doing (void*)(&obj). This is the old C-style casting. We recommend moving towards the explicit cast operators that C++ provides for readability and safety.
This is the most restrictive and should generally be preferred first. If you want to read more, check out: static_cast.
reinterpret_cast¶
We keep coming back to this idea that it’s all just bytes hanging out somewhere. reinterpret_cast takes that literally. It is a way of telling the compiler, “hey, trust me - just take this blob of data and pretend it’s this other blob of data and I promise it’s all good!”. Just like how with great power comes great responsibility, it is way too easy for someone to mess up when doing this. Things stop making as much sense if you start casting random objects to other random object types!
Let’s do a quick example:
struct FourSeperateBytes
{
    uint8_t first;
    uint8_t second;
    uint8_t third;
    uint8_t fourth;
};
// Let's say someone gives us a uint64_t number like:
uint64_t my_value = 67305985;
// in binary, the low four bytes of this are:
// 00000100 00000011 00000010 00000001
//        4        3        2        1
// If we wanted to split the number into its first, second, third, and fourth bytes, we can just do
// (on a little-endian machine, where the lowest byte is stored first):
auto seperated = reinterpret_cast<FourSeperateBytes&>(my_value);
std::cout << "First: " << static_cast<uint32_t>(seperated.first) << std::endl;
std::cout << "Second: " << static_cast<uint32_t>(seperated.second) << std::endl;
std::cout << "Third: " << static_cast<uint32_t>(seperated.third) << std::endl;
std::cout << "Fourth: " << static_cast<uint32_t>(seperated.fourth) << std::endl;
First: 1
Second: 2
Third: 3
Fourth: 4
By using reinterpret_cast we were able to tell the compiler to just pretend the uint64_t that we had is really an object of type FourSeperateBytes. This lets us reach into the byte offsets and use them directly. Isn’t that neat?
It’s important to keep in mind that with reinterpret_cast, all bets are off. The compiler doesn’t warn you when you are about to do something wrong.
// Unlike in the previous example, if we did that with something that wasn't as appropriate:
uint8_t my_small_value = 5;
auto small_seperated = reinterpret_cast<FourSeperateBytes&>(my_small_value);
std::cout << "First: " << static_cast<uint32_t>(small_seperated.first) << std::endl;
std::cout << "Second: " << static_cast<uint32_t>(small_seperated.second) << std::endl;
std::cout << "Third: " << static_cast<uint32_t>(small_seperated.third) << std::endl;
std::cout << "Fourth: " << static_cast<uint32_t>(small_seperated.fourth) << std::endl;
First: 5
Second: 0
Third: 0
Fourth: 0
This lets us read into an area of memory we aren’t really supposed to be reading. Again, these values are going to be garbage. Sometimes they end up being 0, but not always. This is undefined behavior so watch out for these things!
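When the goal is only to pull individual bytes out of a number, shifting and masking is a different technique that avoids reinterpret_cast altogether and does not depend on how the machine orders bytes in memory. A minimal sketch (low_four_bytes is a hypothetical helper, not part of the original example):

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: extract the four lowest bytes of a value,
// least significant byte first, using only arithmetic on the value.
std::array<std::uint8_t, 4> low_four_bytes(std::uint64_t value)
{
    std::array<std::uint8_t, 4> bytes{};
    for (std::size_t i = 0; i < bytes.size(); ++i)
    {
        // Shift the wanted byte into the lowest position, then mask off the rest.
        bytes[i] = static_cast<std::uint8_t>((value >> (8 * i)) & 0xFF);
    }
    return bytes;
}
```

For 67305985 this yields the bytes 1, 2, 3, and 4 in that order, and because the function takes the value itself rather than a reinterpreted reference, there is no way to read past the end of a smaller object.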
Just like how there is a time and place for everything, this too has its purpose. If you want to read more, check out: reinterpret_cast.
const_cast¶
Just like reinterpret_cast, const_cast gives you a lot of power that you ought to think twice about using. It lets us cast away the constness of something.
In C++, we have the const qualifier that lets us tell the compiler, “please make sure no one modifies this!”. const_cast lets you tell the compiler it’s okay when you decide to change it anyway!
uint64_t my_unmodifiable_variable = 24;
const uint64_t& ref_to_unmodifiable = my_unmodifiable_variable;
// If we uncomment the next line, the compiler (rightfully so) will throw errors saying:
// error: cannot assign to variable 'ref_to_unmodifiable' with const-qualified type 'const uint64_t &' (aka 'const unsigned long long &')
// ref_to_unmodifiable = 10;
// But what if we want to do it anyway? const_cast to the rescue! (... or to cause mayhem)
std::cout << "Before: " << my_unmodifiable_variable << std::endl;
const_cast<uint64_t&>(ref_to_unmodifiable) = 12345;
std::cout << "After: " << my_unmodifiable_variable << std::endl;
Before: 24
After: 12345
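The example above is well defined only because my_unmodifiable_variable itself was not declared const; writing through const_cast to an object that really is const is undefined behavior. A more defensible use is calling an older interface that forgot to mark its parameter const but is known not to modify it. The sketch below assumes such a function exists (legacy_length is made up for illustration):

```cpp
#include <cstddef>
#include <string>

// Hypothetical legacy function: takes a non-const pointer even though
// it only ever reads from it.
std::size_t legacy_length(char* text)
{
    std::size_t length = 0;
    while (text[length] != '\0')
    {
        ++length;
    }
    return length;
}

// Wrapper that keeps a const-correct interface for callers. The const_cast
// is acceptable only because legacy_length never writes through the pointer.
std::size_t length_of(const std::string& text)
{
    return legacy_length(const_cast<char*>(text.c_str()));
}
```

Here length_of("hello") returns 5, and the cast stays hidden in one place instead of leaking into every caller.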
If you want to read more, check out: const_cast
dynamic_cast¶
“Safely converts pointers and references to classes up, down, and sideways along the inheritance hierarchy.” - CppReference
Earlier, we mentioned how static_casting a base User type to a PayingUser type (when it was never originally a PayingUser type) can be a problem. static_cast can’t protect us from this. But what if we wanted to know whether we could, and only then do the cast? That’s one place where dynamic_cast comes in.
if (PayingUser* paying_user = dynamic_cast<PayingUser*>(&nonpaying_user))
{
std::cout << "This was a paying user afterall - safe to do so!" << std::endl;
}
else
{
std::cout << "Nope, never was a paying user type!" << std::endl;
}
Nope, never was a paying user type!
You might be wondering two things at this point.
How in the world does dynamic_cast know if something was of a certain type or not?
Why doesn’t static_cast also do this?
The reason why static_cast doesn’t also do this is that dynamic_cast incurs a runtime cost to perform the check. static_cast is resolved at compile time, so it has no chance to check this. But what is this runtime check it’s doing, and where is the data for it? In the vtable!
When we created the User class, we gave it a virtual destructor. This sets up a vtable that hangs on to this information. We don’t expect that to mean much at this point, which is why we have a full section dedicated to Virtuals if you want to dive deeper. Otherwise, for now, just know that some type information is stashed away in this vtable (it’s how C++ knows to properly destroy an object all the way to the derived type even if you only have a pointer to the base type).
The important part here is that dynamic_cast, while safer, has a cost. A runtime cost. If performance is really a concern, it’s generally best to avoid using this.
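To see the successful branch as well as the failing one, here is a small sketch that wraps the check in a function returning std::optional (C++17). Member, PaidMember, and try_amount_paid are hypothetical names standing in for the User and PayingUser types above:

```cpp
#include <cstdint>
#include <optional>

// Stand-in types for illustration only. The virtual destructor makes
// Member polymorphic, which dynamic_cast requires for this kind of check.
struct Member
{
    virtual ~Member() = default;
    std::uint64_t id = 0;
};

struct PaidMember : Member
{
    std::uint64_t amount_paid = 0;
};

// Returns the amount paid if `member` really is a PaidMember,
// and an empty optional otherwise.
std::optional<std::uint64_t> try_amount_paid(const Member& member)
{
    // With pointers, a failed dynamic_cast yields nullptr instead of garbage.
    if (const auto* paid = dynamic_cast<const PaidMember*>(&member))
    {
        return paid->amount_paid;
    }
    return std::nullopt;
}
```

Passing a PaidMember with amount_paid set to 10 gives back 10, while passing a plain Member gives back an empty optional. Note that dynamic_cast on a reference (rather than a pointer) reports failure by throwing std::bad_cast instead.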
If you want to read more on this topic, check out:
Key takeaways¶
Everything is really just some data hanging out in some location on a computer
C++ offers ways of manipulating the meaning of this “data” with static_cast, reinterpret_cast, const_cast, and dynamic_cast
Always prefer the most restrictive cast first (aka, static_cast) and escalate as necessary.
Anything goes when it comes to reinterpret_cast so be careful!
Using const_cast can be really confusing to a reader because it can violate an earlier expectation of constness
dynamic_cast has a runtime cost!
References:¶
If you want to continue learning more on this subject or are curious about other resources that might help in the learning process: