When vectors misbehave

Mischief with vectors

My go-to data structure in the C++ standard library is std::vector. Most problems can be handled with good use of this container. But this is C++, and because it’s C++, things don’t always behave the way we might assume at a high level. In this section, we’ll go over how certain types, used in conjunction with a vector, may act differently than one might expect at first.

So let’s dig in and start off with some code:

#include <vector>
#include <iostream>
#include <string>
#include <typeinfo> // required for typeid


template<typename T>
void try_replace_with_auto(T first, T second, bool show_type)
{
    // We create an empty vector and add the first value
    std::vector<T> values;
    values.push_back(first);

    std::cout << "Before - values[0]: " << values[0] << " | ";

    // We then point at the first value using auto and then replace it with the second value
    auto value = values[0];
    value = second;
    
    std::cout << "After - values[0]: " << values[0] << std::endl;
    
    if (show_type) {
        std::cout << "type of value: " << typeid(value).name() << std::endl;
    }
}

In the above code block, we have a function called try_replace_with_auto that takes two values and a flag. We are really just trying to push the first value into a vector, grab that element back out using auto, and then assign the second value to whatever we grabbed.

What would you expect the function would do if we provide it first = 5 and second = 10?

If you guessed it would print 5 twice, you’d be right! auto deduces a plain T here, so value is a copy of the element, and assigning the second value to it doesn’t change what is stored in the vector.

    auto value = values[0];
    // is going to behave as if we did
    T value = values[0]; // COPY!

Let’s confirm that with code:

// Let's start off with some integers, shall we?
try_replace_with_auto(5, 10, false);
Before - values[0]: 5 | After - values[0]: 5

Great! Just as we expected. The second value isn’t persisted into the vector. Let’s try some more examples then, shall we?

// What about string literals? (T is deduced as const char* here, not char)
try_replace_with_auto("a", "b", false);
Before - values[0]: a | After - values[0]: a

Our sanity still checks out. It just prints a twice. Let’s try some floating point numbers!

try_replace_with_auto(6.0, 12.0, false);
Before - values[0]: 6 | After - values[0]: 6

Still going strong! What other primitive types are there? Shall we try a bool?

// What about this one?
try_replace_with_auto(true, false, false);
Before - values[0]: 1 | After - values[0]: 0

Whoa, wait a second. The second value shouldn’t have persisted. What happened?

Maybe we can inspect the type of value and see what is going on. Passing true as the third argument, show_type, prints the name that typeid reports for it.

try_replace_with_auto(5, 10, true);
try_replace_with_auto("a", "b", true);
try_replace_with_auto(6.0, 12.0, true);
Before - values[0]: 5 | After - values[0]: 5
type of value: i
Before - values[0]: a | After - values[0]: a
type of value: PKc
Before - values[0]: 6 | After - values[0]: 6
type of value: d

Hmm, the first 3 all come out as plain value types: i, PKc and d are the mangled names that GCC and Clang use for int, const char* and double. What about the bool then?

try_replace_with_auto(true, false, true);
Before - values[0]: 1 | After - values[0]: 0
type of value: St14_Bit_reference

Oh. Very interesting. This isn’t a regular bool type. Demangled, the name reads std::_Bit_reference, which looks like some kind of reference to a bool? But we never told it to take a reference. What happened here?

It turns out that vector has a specialization for the bool type. You’ll be able to confirm this by checking out the Classes section at: https://en.cppreference.com/w/cpp/header/vector.

vector<bool> | space-efficient dynamic bitset (class template specialization)

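Because a bool only needs a single bit to hold its value, the specialization is allowed to do some bit munging to save space, which typically means packing 8 bools into a byte. This also means that vector<bool> ends up being a non-conforming container, i.e. it breaks a whole boatload of expected behaviours. Individual bits aren’t addressable, so operator[] can’t hand back a bool&, and you can’t really pass the contents into a C API that expects a bool array.

Instead, operator[] returns a proxy object of type std::vector<bool>::reference, which in libstdc++ (GCC’s standard library, which produced the output above) is a typedef for struct _Bit_reference. So auto deduces that proxy type rather than bool, and the "copy" still writes through to the vector. Writing auto& doesn’t give you a bool& either: it fails to compile, because the proxy is a temporary. struct _Bit_reference looks something like: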

typedef unsigned long _Bit_type;

struct _Bit_reference
  {
    _Bit_type * _M_p;
    _Bit_type _M_mask;

    // constructors, operators, etc...

    operator bool() const
    { return !!(*_M_p & _M_mask); }
  };

Some companies ban std::vector<bool> in their codebases because this behaviour is easy to forget and easy to hurt yourself with. I don’t know if it’s worth going as far as banning it (unless you are working in a safety-critical codebase), but it is worth being aware of, and worth avoiding if you are wary of it.