Bjarne Stroustrup says we must avoid linked lists

Solution 1:

Advantages of vector vs. linked list

The main advantage of vector versus linked lists is memory locality.

Usually, each element is allocated seperately in a linked list. As a consequence those elements probably are not next to each other in memory. (Gaps between the elements in memory.)

A vector is guaranteed to store all contained elements contiguously. (Items next to each other, no gaps;)

Note: Oversimplifications may occur... ;)

Imo, the simplified key points about the superior performance of a contiguously stored data storage pattern versus non-contiguous storage are

1. cache misses

Modern CPUs do not fetch single bytes from memory but slightly larger chunks. Thus, if your data objects size is less than the size of those chunks and if storage is contiguous, you can get more than one element at a time because multiple elements may be in a single chunk.

Example:

A 64byte block (usual cache line size) fits sixteen 32bit integers at a time. Therefore, a cache miss (data not in cache yet -> load from main memory required) occurs at the earliest after processing 16 elements from the moment first one has been brought to cache. If a linked list is used, the first element might well be the only one within a 64byte block. It might in theory happen that a cache miss occurs for every single element of the list.

Concrete example:

std::vector<std::uint32_t> v;
// somehow fill 64 values into v
std::uint32_t s{};
for(std::size_t i{0}; i<v.size(); ++i)
{
  s += v[i];
}

Imagine the contents of v not being cached yet.

What happens during the processing of the data in the for loop?

1)Check whether element v[0] is in cache. --> Nope

2)Fetch 64 bytes starting at the address of v[0] from main memory into a cache line

3)Load v[0] from cache and process by adding its value to s

4)Is element v1 in cache? --> Yes loaded with previous fetch because neighbouring v[0]

5)Load v1 from cache and process by adding its value to s

6)Is element v[2] in cache? --> Yes ...

7) Load v[2] from cache and process by adding its value to s

... etc...

34)Is element v[16] in cache? --> Nope

35) Fetch 64 bytes starting at the address of v[16] from main memory into a cache line

36)Load v[16] from cache and process by adding its value to s

37)Is element v[17] in cache? --> Yes loaded with previous fetch because neighbouring v[16]

etc...

Fetching data from main memory into the cache takes much more time than loading data from cache into processor registers and perform simple operations. Therefore, the fact that multiple values may reside on a single cache line can boost performance significantly.

Linked lists do not provide a contiguous storage guarantee and you cannot hope to get this performance boost. This is also the reason why random iteration (accessing elements randomly) performs worse than forward iteration (accessing elements in order) for contiguous containers.

2. prefetching

The above effect is amplified by a CPU feature called "prefetcher".

If a chunk has been loaded from main memory, the prefetcher prepares loading the next chunk / already puts it into cache, significantly reducing the penality of loading stuff from that part of the main memory.

This is of course effective if and only if you in fact need data from the next prepared chunk.

How do vectors usually work (internally)?

See: c++ Vector, what happens whenever it expands/reallocate on stack?