Arrays

In this series, we are going to delve into the basic containers that the standard library offers us. But we have to start somewhere and where else to start but the most basic of the containers: the std::array.

In order to build an understanding, we will contract a bit with another container available in the standard library: std::vector (probably the most popular one).

No one was ever fired for using a vector

When in doubt, the container to whip out if definately std::vector. But we can’t always just settle for what is most likely to be good. We care about what’s best. In this section, we will talk about times when we might actually want to consider dropping down to a std::array.

The standard library conveniently provides us with a thin wrapper around a C-style array known as std::array.

Before we dive too far into std::array, lets talk about std::vector and what it looks like:

#include <vector>
#include <iostream>
// You could probably also do this with integer sizes instead of all pointers
template <typename T>
struct SimpleVector
{
    T* begin_;
    T* end_;
    T* end_capacity_;
};
std::cout << "Size of empty SimpleVector<int>: " << sizeof(SimpleVector<int>{}) << std::endl;
std::cout << "Size of empty vector<int>: " << sizeof(std::vector<int>{}) << std::endl;
Size of empty SimpleVector<int>: 24
Size of empty vector<int>: 24

An std::array doesn’t differentiate between a partially populated container and a fully populated container. As a result, off the bat, we save on the storage used for capacity.

Furthermore, the data housed in a std::vector is stored on the heap, but with an std::array, it can be on the stack. Accessing the heap memory requires an allocation and this can be expensive. In contract, allocating stack memory involves adjusting the stack pointer - this is done once upon entry into the function. Moreover, this avoids memory fragmentation. (Note: You can technically allocate an std::array on the heap).

#include <vector>
#include <array>

constexpr size_t num_values = 10000;

static void bench_vector_push_back(benchmark::State& state) {
  for (auto _ : state) {
    std::vector<int> result;
    
    for (int i = 0; i < num_values; ++i) {
        result.push_back(i * i);
    }
    benchmark::DoNotOptimize(result);
  }
}
BENCHMARK(bench_vector_push_back);

static void bench_vector_reserve_and_push_back(benchmark::State& state) {
  for (auto _ : state) {
    std::vector<int> result;
    result.reserve(num_values);

    for (int i = 0; i < num_values; ++i) {
        result.push_back(i * i);
    }
    benchmark::DoNotOptimize(result);
  }
}
BENCHMARK(bench_vector_reserve_and_push_back);

static void bench_vector_prealloc(benchmark::State& state) {
  for (auto _ : state) {
    std::vector<int> result(num_values);

    for (int i = 0; i < num_values; ++i) {
        result[i] = i * i;
    }
    benchmark::DoNotOptimize(result);
  }
}
BENCHMARK(bench_vector_prealloc);

static void bench_array(benchmark::State& state) {
  for (auto _ : state) {
    std::array<int, num_values> result;

    for (int i = 0; i < num_values; ++i) {
        result[i] = i * i;
    }
    benchmark::DoNotOptimize(result);
  }
}
BENCHMARK(bench_array);

If you want to play around with the benchmark yourself, check out: Quick Bench - Simple - Array vs. Vector.

The generated benchmarks from the above looks something like: alt text

There are several advantages to using a std::array:

  • The compiler will reserve the space needed at compile time

  • The object is stored on the stack, which means no messy heaps involved

  • There is no extra overhead aside from the side of the underlying data

There is a few pre-requisites to using it however:

  • We must know the size at compile time (it is expected as a template type) - template <class T, std::size_t N> struct array;

  • Cannot be resized at any point

Note

If you declare an std::array locally, the size of the object is limited by the size of the stack. The only way you can get around this is by declaring it in the file scope or as a static variable. The reason this wroks is that the static variables end up living in a static heap. (If you are getting to a point where you are pushing the limits on the size of the std::array, its probably worth re-evaluating if there is a better way to approach the problem.)