Behaviour when storing std::thread in vectors

Similar questions have been asked before, however none seem to be explaining some behavior I observed. Some answers have already explained why you need to have the vector of pointers to threads (copy vs move). A few follow up questions:

  1. I'm trying to instantiate the pointers in a temp variable and push the temp variable to the vector. It fails, but works when using std::move or when putting the whole instantiation in the push_back(). Why is std::move necessary since push_back() has the const reference as the argument, i.e. what's the difference?

  2. Why does accessing threads by subscription greeters[i] throw a segmentation fault? The program runs for a few iterations and then crashes?

  3. Similar to above but with iterators. Only using back() seems to work fine. Does it have anything to do with the const reference return type? If so what?

Compile using g++ -std=c++20 -pthread -o thread thread.cpp.

#include <iostream>
#include <thread>
#include <mutex>
#include <vector>

std::mutex m;
int cnt = 0;


void say_hello(int id)
{
    m.lock();
    cnt++;
    std::cout << cnt << " Hello from " << id << std::endl;
    m.unlock();
}

int main()
{
    int n = 20;
    std::vector<std::unique_ptr<std::thread>> greeters(n);

    for (int i = 0; i < n; i++) {
      // auto tmp = std::unique_ptr<std::thread>(new std::thread( [&i]() { return say_hello(i); } ) );
      // greeters.push_back(tmp); // error: no matching function for call to 'construct_at'

      // greeters.push_back(std::move(tmp)); // works fine
      greeters.push_back(std::unique_ptr<std::thread>(new std::thread( [&i]() { return say_hello(i); } ) ));  // works fine
    }

    for (int i = 0; i < n; i++ ) {
      auto val = std::move(greeters[i]); // throws segmentation fault
      // auto val = std::move(greeters.back()); // works fine
      // greeters.pop_back();
      val -> join();
    }

    // std::vector<std::unique_ptr<std::thread>>::iterator ptr;
    // for (ptr = greeters.begin(); ptr < greeters.end(); ptr++) {
    //   (*ptr) -> join(); // throws segmentation fault
    // }

    return 0;
}

The bug has nothing to do with vectors or threads, per se. The constructor was wrong.

std::vector<std::unique_ptr<std::thread>> greeters(n);

This creates a vector with 20 values, which will be empty unique_ptrs.

This is what this constructor does. The n parameter to std::vector's constructor specifies the initialize size of the vector, which gets default-initialized. With unique_ptrs, the default-initialization results in null unique_ptrs.

for (int i = 0; i < n; i++) {
      greeters.push_back...

This ends up push_backing twenty more unique_ptrs to the vector. It now has 40 unique_ptrs. The first twenty of them are null pointers.

    for (int i = 0; i < n; i++ ) {
      auto val = std::move(greeters[i]); // throws segmentation fault
      val -> join();
    }

This iterates over the first 20 unique_ptrs in the vector, which a are null, and attempts to dereference the null ptrs in a futile attempt to join them. This ends in tears, and segmentation faults.

I should mention one more comment: I also did not immediately see the bug in the shown program, right away. But I simply loaded your program in my debugger, set a breakpoint, and the bug became obvious in just a few seconds:

(gdb) p greeters
$2 = std::vector of length 40, capacity 40 = {std::unique_ptr<std::thread> = {
    get() = 0x0}, std::unique_ptr<std::thread> = {get() = 0x0}, 
  std::unique_ptr<std::thread> = {get() = 0x0}, 
  std::unique_ptr<std::thread> = {get() = 0x0}, 
  std::unique_ptr<std::thread> = {get() = 0x0}, 
  std::unique_ptr<std::thread> = {get() = 0x0}, 
  std::unique_ptr<std::thread> = {get() = 0x0}, 
  std::unique_ptr<std::thread> = {get() = 0x0}, 
  std::unique_ptr<std::thread> = {get() = 0x0}, 
  std::unique_ptr<std::thread> = {get() = 0x0}, 
  std::unique_ptr<std::thread> = {get() = 0x0}, 
  std::unique_ptr<std::thread> = {get() = 0x0}, 
  std::unique_ptr<std::thread> = {get() = 0x0}, 
  std::unique_ptr<std::thread> = {get() = 0x0}, 
  std::unique_ptr<std::thread> = {get() = 0x0}, 
  std::unique_ptr<std::thread> = {get() = 0x0}, 
  std::unique_ptr<std::thread> = {get() = 0x0}, 
  std::unique_ptr<std::thread> = {get() = 0x0}, 
  std::unique_ptr<std::thread> = {get() = 0x0}, 
  std::unique_ptr<std::thread> = {get() = 0x0}, 
  std::unique_ptr<std::thread> = {get() = 0x418f60}, 
  std::unique_ptr<std::thread> = {get() = 0x419220}, 
  std::unique_ptr<std::thread> = {get() = 0x418f80},
...

Why is this vector twice as large as it should be, and why is the first half of it empty unique_ptrs? Oh...

This is why it's a good idea to invest some time in learning how to use a debugger in order to be able to solve these Scooby-Doo mysteries, all by yourself.


Get rid of all that indirection. It's not needed. You can and should create vectors of threads and not go through two layers of indirection:

[temp]$ cat test.cpp
#include <thread>
#include <vector>
#include <iostream>

int main() {
    std::vector<std::thread> vec;
    std::thread thr;
    vec.push_back(std::move(thr));
    std::cout << "Hello, world\n";
    return 0;
}
[temp]$ clang++ -std=c++11 test.cpp
[temp]$ ./a.out
Hello, world
[temp]$