How to automatically generate a stacktrace when my program crashes
For Linux and I believe Mac OS X, if you're using gcc, or any compiler that uses glibc, you can use the backtrace() functions in execinfo.h
to print a stacktrace and exit gracefully when you get a segmentation fault. Documentation can be found in the libc manual.
Here's an example program that installs a SIGSEGV
handler and prints a stacktrace to stderr
when it segfaults. The baz()
function here causes the segfault that triggers the handler:
#include <stdio.h>
#include <execinfo.h>
#include <signal.h>
#include <stdlib.h>
#include <unistd.h>
void handler(int sig) {
void *array[10];
size_t size;
// get void*'s for all entries on the stack
size = backtrace(array, 10);
// print out all the frames to stderr
fprintf(stderr, "Error: signal %d:\n", sig);
backtrace_symbols_fd(array, size, STDERR_FILENO);
exit(1);
}
void baz() {
int *foo = (int*)-1; // make a bad pointer
printf("%d\n", *foo); // causes segfault
}
void bar() { baz(); }
void foo() { bar(); }
int main(int argc, char **argv) {
signal(SIGSEGV, handler); // install our handler
foo(); // this will call foo, bar, and baz. baz segfaults.
}
Compiling with -g -rdynamic
gets you symbol info in your output, which glibc can use to make a nice stacktrace:
$ gcc -g -rdynamic ./test.c -o test
Executing this gets you this output:
$ ./test
Error: signal 11:
./test(handler+0x19)[0x400911]
/lib64/tls/libc.so.6[0x3a9b92e380]
./test(baz+0x14)[0x400962]
./test(bar+0xe)[0x400983]
./test(foo+0xe)[0x400993]
./test(main+0x28)[0x4009bd]
/lib64/tls/libc.so.6(__libc_start_main+0xdb)[0x3a9b91c4bb]
./test[0x40086a]
This shows the load module, offset, and function that each frame in the stack came from. Here you can see the signal handler on top of the stack, and the libc functions before main
in addition to main
, foo
, bar
, and baz
.
It's even easier than "man backtrace", there's a little-documented library (GNU specific) distributed with glibc as libSegFault.so, which was I believe was written by Ulrich Drepper to support the program catchsegv (see "man catchsegv").
This gives us 3 possibilities. Instead of running "program -o hai":
-
Run within catchsegv:
$ catchsegv program -o hai
-
Link with libSegFault at runtime:
$ LD_PRELOAD=/lib/libSegFault.so program -o hai
-
Link with libSegFault at compile time:
$ gcc -g1 -lSegFault -o program program.cc $ program -o hai
In all 3 cases, you will get clearer backtraces with less optimization (gcc -O0 or -O1) and debugging symbols (gcc -g). Otherwise, you may just end up with a pile of memory addresses.
You can also catch more signals for stack traces with something like:
$ export SEGFAULT_SIGNALS="all" # "all" signals
$ export SEGFAULT_SIGNALS="bus abrt" # SIGBUS and SIGABRT
The output will look something like this (notice the backtrace at the bottom):
*** Segmentation fault Register dump:
EAX: 0000000c EBX: 00000080 ECX:
00000000 EDX: 0000000c ESI:
bfdbf080 EDI: 080497e0 EBP:
bfdbee38 ESP: bfdbee20
EIP: 0805640f EFLAGS: 00010282
CS: 0073 DS: 007b ES: 007b FS:
0000 GS: 0033 SS: 007b
Trap: 0000000e Error: 00000004
OldMask: 00000000 ESP/signal:
bfdbee20 CR2: 00000024
FPUCW: ffff037f FPUSW: ffff0000
TAG: ffffffff IPOFF: 00000000
CSSEL: 0000 DATAOFF: 00000000
DATASEL: 0000
ST(0) 0000 0000000000000000 ST(1)
0000 0000000000000000 ST(2) 0000
0000000000000000 ST(3) 0000
0000000000000000 ST(4) 0000
0000000000000000 ST(5) 0000
0000000000000000 ST(6) 0000
0000000000000000 ST(7) 0000
0000000000000000
Backtrace:
/lib/libSegFault.so[0xb7f9e100]
??:0(??)[0xb7fa3400]
/usr/include/c++/4.3/bits/stl_queue.h:226(_ZNSt5queueISsSt5dequeISsSaISsEEE4pushERKSs)[0x805647a]
/home/dbingham/src/middle-earth-mud/alpha6/src/engine/player.cpp:73(_ZN6Player5inputESs)[0x805377c]
/home/dbingham/src/middle-earth-mud/alpha6/src/engine/socket.cpp:159(_ZN6Socket4ReadEv)[0x8050698]
/home/dbingham/src/middle-earth-mud/alpha6/src/engine/socket.cpp:413(_ZN12ServerSocket4ReadEv)[0x80507ad]
/home/dbingham/src/middle-earth-mud/alpha6/src/engine/socket.cpp:300(_ZN12ServerSocket4pollEv)[0x8050b44]
/home/dbingham/src/middle-earth-mud/alpha6/src/engine/main.cpp:34(main)[0x8049a72]
/lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe5)[0xb7d1b775]
/build/buildd/glibc-2.9/csu/../sysdeps/i386/elf/start.S:122(_start)[0x8049801]
If you want to know the gory details, the best source is unfortunately the source: See http://sourceware.org/git/?p=glibc.git;a=blob;f=debug/segfault.c and its parent directory http://sourceware.org/git/?p=glibc.git;a=tree;f=debug
Linux
While the use of the backtrace() functions in execinfo.h to print a stacktrace and exit gracefully when you get a segmentation fault has already been suggested, I see no mention of the intricacies necessary to ensure the resulting backtrace points to the actual location of the fault (at least for some architectures - x86 & ARM).
The first two entries in the stack frame chain when you get into the signal handler contain a return address inside the signal handler and one inside sigaction() in libc. The stack frame of the last function called before the signal (which is the location of the fault) is lost.
Code
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#ifndef __USE_GNU
#define __USE_GNU
#endif
#include <execinfo.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ucontext.h>
#include <unistd.h>
/* This structure mirrors the one found in /usr/include/asm/ucontext.h */
typedef struct _sig_ucontext {
unsigned long uc_flags;
ucontext_t *uc_link;
stack_t uc_stack;
sigcontext_t uc_mcontext;
sigset_t uc_sigmask;
} sig_ucontext_t;
void crit_err_hdlr(int sig_num, siginfo_t * info, void * ucontext)
{
void * array[50];
void * caller_address;
char ** messages;
int size, i;
sig_ucontext_t * uc;
uc = (sig_ucontext_t *)ucontext;
/* Get the address at the time the signal was raised */
#if defined(__i386__) // gcc specific
caller_address = (void *) uc->uc_mcontext.eip; // EIP: x86 specific
#elif defined(__x86_64__) // gcc specific
caller_address = (void *) uc->uc_mcontext.rip; // RIP: x86_64 specific
#else
#error Unsupported architecture. // TODO: Add support for other arch.
#endif
fprintf(stderr, "signal %d (%s), address is %p from %p\n",
sig_num, strsignal(sig_num), info->si_addr,
(void *)caller_address);
size = backtrace(array, 50);
/* overwrite sigaction with caller's address */
array[1] = caller_address;
messages = backtrace_symbols(array, size);
/* skip first stack frame (points here) */
for (i = 1; i < size && messages != NULL; ++i)
{
fprintf(stderr, "[bt]: (%d) %s\n", i, messages[i]);
}
free(messages);
exit(EXIT_FAILURE);
}
int crash()
{
char * p = NULL;
*p = 0;
return 0;
}
int foo4()
{
crash();
return 0;
}
int foo3()
{
foo4();
return 0;
}
int foo2()
{
foo3();
return 0;
}
int foo1()
{
foo2();
return 0;
}
int main(int argc, char ** argv)
{
struct sigaction sigact;
sigact.sa_sigaction = crit_err_hdlr;
sigact.sa_flags = SA_RESTART | SA_SIGINFO;
if (sigaction(SIGSEGV, &sigact, (struct sigaction *)NULL) != 0)
{
fprintf(stderr, "error setting signal handler for %d (%s)\n",
SIGSEGV, strsignal(SIGSEGV));
exit(EXIT_FAILURE);
}
foo1();
exit(EXIT_SUCCESS);
}
Output
signal 11 (Segmentation fault), address is (nil) from 0x8c50
[bt]: (1) ./test(crash+0x24) [0x8c50]
[bt]: (2) ./test(foo4+0x10) [0x8c70]
[bt]: (3) ./test(foo3+0x10) [0x8c8c]
[bt]: (4) ./test(foo2+0x10) [0x8ca8]
[bt]: (5) ./test(foo1+0x10) [0x8cc4]
[bt]: (6) ./test(main+0x74) [0x8d44]
[bt]: (7) /lib/libc.so.6(__libc_start_main+0xa8) [0x40032e44]
All the hazards of calling the backtrace() functions in a signal handler still exist and should not be overlooked, but I find the functionality I described here quite helpful in debugging crashes.
It is important to note that the example I provided is developed/tested on Linux for x86. I have also successfully implemented this on ARM using uc_mcontext.arm_pc
instead of uc_mcontext.eip
.
Here's a link to the article where I learned the details for this implementation: http://www.linuxjournal.com/article/6391
Even though a correct answer has been provided that describes how to use the GNU libc backtrace()
function1 and I provided my own answer that describes how to ensure a backtrace from a signal handler points to the actual location of the fault2, I don't see any mention of demangling C++ symbols output from the backtrace.
When obtaining backtraces from a C++ program, the output can be run through c++filt
1 to demangle the symbols or by using abi::__cxa_demangle
1 directly.
-
1 Linux & OS X
Note that
c++filt
and__cxa_demangle
are GCC specific - 2 Linux
The following C++ Linux example uses the same signal handler as my other answer and demonstrates how c++filt
can be used to demangle the symbols.
Code:
class foo
{
public:
foo() { foo1(); }
private:
void foo1() { foo2(); }
void foo2() { foo3(); }
void foo3() { foo4(); }
void foo4() { crash(); }
void crash() { char * p = NULL; *p = 0; }
};
int main(int argc, char ** argv)
{
// Setup signal handler for SIGSEGV
...
foo * f = new foo();
return 0;
}
Output (./test
):
signal 11 (Segmentation fault), address is (nil) from 0x8048e07
[bt]: (1) ./test(crash__3foo+0x13) [0x8048e07]
[bt]: (2) ./test(foo4__3foo+0x12) [0x8048dee]
[bt]: (3) ./test(foo3__3foo+0x12) [0x8048dd6]
[bt]: (4) ./test(foo2__3foo+0x12) [0x8048dbe]
[bt]: (5) ./test(foo1__3foo+0x12) [0x8048da6]
[bt]: (6) ./test(__3foo+0x12) [0x8048d8e]
[bt]: (7) ./test(main+0xe0) [0x8048d18]
[bt]: (8) ./test(__libc_start_main+0x95) [0x42017589]
[bt]: (9) ./test(__register_frame_info+0x3d) [0x8048981]
Demangled Output (./test 2>&1 | c++filt
):
signal 11 (Segmentation fault), address is (nil) from 0x8048e07
[bt]: (1) ./test(foo::crash(void)+0x13) [0x8048e07]
[bt]: (2) ./test(foo::foo4(void)+0x12) [0x8048dee]
[bt]: (3) ./test(foo::foo3(void)+0x12) [0x8048dd6]
[bt]: (4) ./test(foo::foo2(void)+0x12) [0x8048dbe]
[bt]: (5) ./test(foo::foo1(void)+0x12) [0x8048da6]
[bt]: (6) ./test(foo::foo(void)+0x12) [0x8048d8e]
[bt]: (7) ./test(main+0xe0) [0x8048d18]
[bt]: (8) ./test(__libc_start_main+0x95) [0x42017589]
[bt]: (9) ./test(__register_frame_info+0x3d) [0x8048981]
The following builds on the signal handler from my original answer and can replace the signal handler in the above example to demonstrate how abi::__cxa_demangle
can be used to demangle the symbols. This signal handler produces the same demangled output as the above example.
Code:
void crit_err_hdlr(int sig_num, siginfo_t * info, void * ucontext)
{
sig_ucontext_t * uc = (sig_ucontext_t *)ucontext;
void * caller_address = (void *) uc->uc_mcontext.eip; // x86 specific
std::cerr << "signal " << sig_num
<< " (" << strsignal(sig_num) << "), address is "
<< info->si_addr << " from " << caller_address
<< std::endl << std::endl;
void * array[50];
int size = backtrace(array, 50);
array[1] = caller_address;
char ** messages = backtrace_symbols(array, size);
// skip first stack frame (points here)
for (int i = 1; i < size && messages != NULL; ++i)
{
char *mangled_name = 0, *offset_begin = 0, *offset_end = 0;
// find parantheses and +address offset surrounding mangled name
for (char *p = messages[i]; *p; ++p)
{
if (*p == '(')
{
mangled_name = p;
}
else if (*p == '+')
{
offset_begin = p;
}
else if (*p == ')')
{
offset_end = p;
break;
}
}
// if the line could be processed, attempt to demangle the symbol
if (mangled_name && offset_begin && offset_end &&
mangled_name < offset_begin)
{
*mangled_name++ = '\0';
*offset_begin++ = '\0';
*offset_end++ = '\0';
int status;
char * real_name = abi::__cxa_demangle(mangled_name, 0, 0, &status);
// if demangling is successful, output the demangled function name
if (status == 0)
{
std::cerr << "[bt]: (" << i << ") " << messages[i] << " : "
<< real_name << "+" << offset_begin << offset_end
<< std::endl;
}
// otherwise, output the mangled function name
else
{
std::cerr << "[bt]: (" << i << ") " << messages[i] << " : "
<< mangled_name << "+" << offset_begin << offset_end
<< std::endl;
}
free(real_name);
}
// otherwise, print the whole line
else
{
std::cerr << "[bt]: (" << i << ") " << messages[i] << std::endl;
}
}
std::cerr << std::endl;
free(messages);
exit(EXIT_FAILURE);
}
Might be worth looking at Google Breakpad, a cross-platform crash dump generator and tools to process the dumps.