What happens to a detached thread when main() exits?

Solution 1:

The answer to the original question "what happens to a detached thread when main() exits" is:

It continues running (because the standard doesn't say it is stopped), and that's well-defined, as long as it touches neither (automatic|thread_local) variables of other threads nor static objects.

This appears to be permitted so that thread managers can be implemented as static objects (a note in [basic.start.term]/4 says as much; thanks to @dyp for the pointer).

Problems arise when the destruction of static objects has finished, because then execution enters a regime where only code allowed in signal handlers may execute ([basic.start.term]/1, 1st sentence). Of the C++ standard library, that is only the <atomic> library ([support.runtime]/9, 2nd sentence). In particular, that excludes condition_variable in general (it's implementation-defined whether it is safe to use in a signal handler, because it's not part of <atomic>).

Unless you've unwound your stack at this point, it's hard to see how to avoid undefined behaviour.

The answer to the second question "can detached threads ever be joined again" is:

Yes, with the *_at_thread_exit family of functions (notify_all_at_thread_exit(), std::promise::set_value_at_thread_exit(), ...).

As noted in footnote [2] of the question, signalling a condition variable or a semaphore or an atomic counter is not sufficient to join a detached thread (in the sense of ensuring that the end of its execution happens before the waiting thread receives the signal), because, in general, more code will execute after e.g. a notify_all() of a condition variable, in particular the destructors of automatic and thread-local objects.

Running the signalling as the last thing the thread does (after the destructors of automatic and thread-local objects have run) is what the _at_thread_exit family of functions was designed for.

So, in order to avoid undefined behaviour in the absence of any implementation guarantees beyond what the standard requires, you need to either (manually) join a detached thread with an _at_thread_exit function doing the signalling, or make the detached thread execute only code that would be safe for a signal handler, too.
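As a minimal sketch of such a manual join, assuming a hypothetical worker function and a placeholder result value of 42 (neither is from the question), std::promise::set_value_at_thread_exit() can be paired with a future that the launching thread waits on:

```cpp
#include <future>
#include <thread>
#include <utility>

// Hypothetical worker. The value is stored right away, but the shared
// state only becomes ready when this thread exits, after its
// thread-local objects have been destroyed.
void worker(std::promise<int> done) {
    // ... real work would go here ...
    done.set_value_at_thread_exit(42);  // 42 is an arbitrary placeholder
}

// Starts a detached worker and blocks until that worker has exited.
int run_detached_and_wait() {
    std::promise<int> done;
    std::future<int> finished = done.get_future();
    std::thread(worker, std::move(done)).detach();
    return finished.get();  // returns only once the state is ready
}
```

Calling run_detached_and_wait() from main() before returning means that, as far as the standard is concerned, the detached thread is no longer running when static destruction begins.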

Solution 2:

Detaching Threads

According to std::thread::detach:

Separates the thread of execution from the thread object, allowing execution to continue independently. Any allocated resources will be freed once the thread exits.

From pthread_detach:

The pthread_detach() function shall indicate to the implementation that storage for the thread can be reclaimed when that thread terminates. If thread has not terminated, pthread_detach() shall not cause it to terminate. The effect of multiple pthread_detach() calls on the same target thread is unspecified.

Detaching threads is mainly for saving resources, in case the application does not need to wait for a thread to finish (e.g. daemons, which must run until process termination):

  1. To free the application-side handle: one can let a std::thread object go out of scope without joining, which would normally lead to a call to std::terminate() on destruction.
  2. To allow the OS to clean up the thread-specific resources (TCB) automatically as soon as the thread exits. Detaching explicitly states that we aren't interested in joining the thread later on, so one cannot join an already detached thread.
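The second point can be checked with a short sketch. The helper names below are made up for illustration. After detach() the std::thread object is no longer joinable, and calling join() on it is specified to throw std::system_error:

```cpp
#include <system_error>
#include <thread>

// Returns whether a std::thread object is still joinable after detach().
bool joinable_after_detach() {
    std::thread t([] {});  // trivial thread body, touches no shared state
    t.detach();
    return t.joinable();   // the handle no longer refers to a thread
}

// Returns true if join() on a detached std::thread throws std::system_error.
bool join_after_detach_throws() {
    std::thread t([] {});
    t.detach();
    try {
        t.join();          // not joinable any more
    } catch (const std::system_error&) {
        return true;
    }
    return false;
}
```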

Killing Threads

The behavior on process termination is the same as the one for the main thread, which could at least catch some signals. Whether or not other threads can handle signals is not that important, as one could join or terminate other threads within the main thread's signal handler invocation. (Related question)

As already stated, any thread, whether detached or not, will die with its process on most OSes. The process itself can be terminated by raising a signal, by calling exit() or by returning from the main function. However, C++11 cannot and does not try to define the exact behaviour of the underlying OS, whereas the developers of a Java VM can surely abstract such differences to some extent. AFAIK, exotic process and threading models are usually found on ancient platforms (to which C++11 probably won't be ported) and various embedded systems, which could have a special and/or limited language library implementation and also limited language support.

Thread Support

If threads aren't supported, std::thread::get_id() should return an invalid id (a default-constructed std::thread::id), as there's a plain process that does not need a thread object to run, and the constructor of std::thread should throw a std::system_error. This is how I understand C++11 in conjunction with today's OSes. If there's an OS with threading support that doesn't spawn a main thread in its processes, let me know.

Controlling Threads

If one needs to keep control over a thread for proper shutdown, one can do that by using sync primitives and/or some sort of flags. In this case, however, I prefer setting a shutdown flag followed by a join. There's no point in increasing complexity by detaching threads, as the resources would be freed at the same time anyway, and keeping the few bytes of the std::thread object is an acceptable price compared with higher complexity and possibly more sync primitives.
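A minimal sketch of the flag-plus-join approach, assuming a hypothetical Worker class whose loop body is just a counter standing in for real work:

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Hypothetical background worker that is stopped with a flag and then joined.
class Worker {
public:
    Worker() : thread_([this] { run(); }) {}
    ~Worker() { stop(); }  // never leaves a joinable thread behind

    // Requests shutdown and waits for the thread to finish.
    void stop() {
        stop_requested_.store(true);
        if (thread_.joinable()) thread_.join();
    }

    int iterations() const { return iterations_.load(); }

private:
    void run() {
        while (!stop_requested_.load()) {
            ++iterations_;  // stand-in for one unit of real work
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    }

    std::atomic<bool> stop_requested_{false};
    std::atomic<int> iterations_{0};
    std::thread thread_;  // declared last so the flags exist before the thread starts
};
```

Because stop() joins, no code from the worker thread can still be running once it returns, so nothing is left to race with static destruction when main() exits.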

Solution 3:

Consider the following code:

#include <iostream>
#include <string>
#include <thread>
#include <chrono>

void thread_fn() {
    std::this_thread::sleep_for(std::chrono::seconds(1));
    std::cout << "Inside thread function\n";
}

int main()
{
    std::thread t1(thread_fn);
    t1.detach();

    return 0; 
}

Running it on a Linux system, the message from thread_fn() is never printed. The process terminates as soon as main() returns, taking the still-sleeping detached thread with it. Replacing t1.detach() with t1.join() always prints the message as expected.