when g++ static link pthread, cause Segmentation fault, why?

First, the solution. This here will work:

Update: Since Ubuntu 18.04, you need to link also against librt (add -lrt):

g++ -o one one.cpp -Wall -std=c++11 -O3 -static -lrt -pthread \
    -Wl,--whole-archive -lpthread -Wl,--no-whole-archive

(continue with original answer)

g++ -o one one.cpp -Wall -std=c++11 -O3 -static -pthread \
    -Wl,--whole-archive -lpthread -Wl,--no-whole-archive

When you use -pthread, the compiler will already link against pthread (and depending on the platform, it does define extra macros like -D_REENTRANT, see this question for more details).

So if -pthread implies -lpthread, why do you need you specify -lpthread when you are linking statically? And what does Wl,--whole-archive do?

Understanding weak symbols

On Unix, the ELF file format is used, which has the concept of weak and strong symbols. To quote from the Wikipedia page:

By default, without any annotation, a symbol in an object file is strong. During linking, a strong symbol can override a weak symbol of the same name. In contrast, two strong symbols that share a name yield a link error during link-time.

There is a subtle difference when it comes to dynamic and static libraries. In static libraries, the linker will stop at the first symbol, even if it is a weak one, and stops looking for strong ones. To force it to look at all symbols (like it would have done for a dynamically linked library), ld supports the --whole-archive option.

To quote from man ld:

--whole-archive: For each archive mentioned on the command line after the --whole-archive option, include every object file in the archive in the link, rather than searching the archive for the required object files. This is normally used to turn an archive file into a shared library, forcing every object to be included in the resulting shared library. This option may be used more than once.

It goes on by explaining that from gcc, you have to pass the option as -Wl,--whole-archive:

Two notes when using this option from gcc: First, gcc doesn't know about this option, so you have to use -Wl,-whole-archive. Second, don't forget to use -Wl,-no-whole-archive after your list of archives, because gcc will add its own list of archives to your link and you may not want this flag to affect those as well.

And it explains how to turn it off, again:

--no-whole-archive: Turn off the effect of the --whole-archive option for subsequent archive files.

Weak symbols in pthread and libstdc++

One of the use cases of weak symbols is to be able to swap out implementations with optimized ones. Another is to use stubs, which can later to replaced if necessary.

For example, fputc (conceptionally used by printf) is required by POSIX to be thread-safe and needs to be synchronized, which is costly. In a single-threaded environment, you do not want to pay the costs. An implementation could therefore implement the synchronization functions as empty stubs, and declare the functions as weak symbols.

Later, if a multi-threading library is linked (e.g., pthread), it becomes obvious that single-thread support is not intended. When linking the multi-threading library, the linker can then replace the stubs by the real synchronization functions (defined as strong symbols and implemented by the threading-library). On the other hand, if no multi-threading library is linked, the executable will use the stubs for the synchronization function.

glibc (providing fputc) and pthreads seem to use exactly this trick. For details, refer to this question about the usage of weak symbols in glibc. The example above is taken from this answer.

nm allows you to look at it in detail, which seems consistent with the cited answer above:

$ nm /usr/lib/libc.a 2>/dev/null | grep pthread_mutex_lock
w __pthread_mutex_lock
... (repeats)

"w" stands for "weak", so the statically linked libc library contains __pthread_mutex_lock as a weak symbol. The statically linked pthread library contains it as a strong symbol:

$ nm /usr/lib/libpthread.a 2>/dev/null | grep pthread_mutex_lock
             U pthread_mutex_lock
pthread_mutex_lock.o:
00000000000006a0 T __pthread_mutex_lock
00000000000006a0 T pthread_mutex_lock
0000000000000000 t __pthread_mutex_lock_full

Back to the example program

By looking at the shared library dependencies of the dynamically linked executable, I get almost the same output of ldd on my machine:

$ ldd one
linux-vdso.so.1 (0x00007fff79d6d000)
libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x00007fcaaeeb3000)
libm.so.6 => /usr/lib/libm.so.6 (0x00007fcaaeb9b000)
libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0x00007fcaae983000)
libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007fcaae763000)
libc.so.6 => /usr/lib/libc.so.6 (0x00007fcaae3bb000)
/lib64/ld-linux-x86-64.so.2 (0x00007fcaaf23b000)

Printing out the library calls with ltrace, leads to the following output:

$ ltrace -C ./one 
std::ios_base::Init::Init()(0x563ab8df71b1, 0x7ffdc483cae8, 0x7ffdc483caf8, 160) = 0
__cxa_atexit(0x7fab3023bc20, 0x563ab8df71b1, 0x563ab8df7090, 6)         = 0
operator new(unsigned long)(16, 0x7ffdc483cae8, 0x7ffdc483caf8, 192)    = 0x563ab918bc20
std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State> >, void (*)())(0x7ffdc483c990, 0x7ffdc483c998, 0x7fab2fa52320, 0x7fab2fa43a80) = 0
operator new(unsigned long)(16, 0x7fab2f6a1fb0, 0, 0x800000)            = 0x563ab918bd70
std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State> >, void (*)())(0x7ffdc483c990, 0x7ffdc483c998, 0x7fab2fa52320, 0x7fab2fa43a80) = 0
operator new(unsigned long)(16, 0x7fab2eea0fb0, 0, 0x800000)            = 0x563ab918bec0
std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State> >, void (*)())(0x7ffdc483c990, 0x7ffdc483c998, 0x7fab2fa52320, 0x7fab2fa43a80test start
) = 0
operator new(unsigned long)(16, 0x7fab2e69ffb0, 0, 0x800000)            = 0x563ab918c010
std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State> >, void (*)())(0x7ffdc483c990, 0x7ffdc483c998, 0x7fab2fa52320, 0x7fab2fa43a80test start
test start
) = 0
std::thread::join()(0x7ffdc483c9a0, 0x7fab2de9efb0, 0, 0x800000test start
test release
test release
test release
test release
test end
test end
test end
test end
)        = 0
std::thread::join()(0x7ffdc483c9a8, 0x7fab2eea19c0, 0x7fab2f6a2700, 0)  = 0
std::thread::join()(0x7ffdc483c9b0, 0x7fab2e6a09c0, 0x7fab2eea1700, 0)  = 0
std::thread::join()(0x7ffdc483c9b8, 0x7fab2de9f9c0, 0x7fab2e6a0700, 0)  = 0
+++ exited (status 0) +++

As an example, std::thread::join is called, which will most likely use pthread_join internally. That symbol can be found in the (dynamically linked) libraries listed in the ldd ouput, namely in libstdc++.so.6 and libpthread.so.0:

$ nm /usr/lib/libstdc++.so.6 | grep pthread_join
                 w pthread_join

$ nm /usr/lib/libpthread.so.0 | grep pthread_join
0000000000008280 T pthread_join

In the dynamically linked executable, the linker will replace weak symbols by strong symbols. In this example, we have to enforce the same semantic for the statically linked libraries. That is why -Wl,--whole-archive -lpthread -Wl,--no-whole-archive is needed.

Finding it out is a bit trial-and-error. At least, I found no clear documentation on that subject. I assume it is because static linking on Linux has become rather an edge case, whereas dynamic linking is often the canonical approach on how to use the libraries (for a comparison, see Static linking vs dynamic linking). The most extreme example that I have seen and personally struggled with a while to get it working is to link TBB statically.

Appendix: Workaround for Autotools

If you are using autotools as a build system, you need a workaround, as automake does not let you set options in the in LDADD. Unfortunately, you cannot write:

(Makefile.am)
mytarget_LDADD = -Wl,--whole-archive -lpthread -Wl,--no-whole-archive

As a workaround, you can avoid the check by defining the flags in configure.ac, and using them like this:

(configure.ac)
WL_WHOLE_ARCHIVE_HACK="-Wl,--whole-archive"
WL_NO_WHOLE_ARCHIVE_HACK="-Wl,--no-whole-archive"
AC_SUBST(WL_WHOLE_ARCHIVE_HACK)
AC_SUBST(WL_NO_WHOLE_ARCHIVE_HACK)

(Makefile.am)
mytarget_LDADD = @WL_WHOLE_ARCHIVE_HACK@ -lpthread @WL_NO_WHOLE_ARCHIVE_HACK@

I had similar issue when linking a pre-built C++ .a archive that uses pthread. In my case I needed in addition to -Wl,--whole-archive -lpthread -Wl,--no-whole-archive also do -Wl,-u,... for each weak symbol.

My symptom was crashes at runtime and when using gdb to disassemble I could see that the crash was just after a callq 0x0 which seemed suspicious. Did some seaching and found that others had seen this with static pthread linking.

I figured out which symbols to force resolve by using nm and look for w symbols. After linking I could then see that the callq 0x0 instructions had been updated with various symbol addresses to pthread_mutex_lock etc.