How should strace be used?

Solution 1:

Strace Overview
strace can be seen as a light weight debugger. It allows a programmer / user to quickly find out how a program is interacting with the OS. It does this by monitoring system calls and signals.

Uses
Good for when you don't have source code or don't want to be bothered to really go through it.
Also, useful for your own code if you don't feel like opening up GDB, but are just interested in understanding external interaction.

A good little introduction
I ran into this intro to strace use just the other day: strace hello world

Solution 2:

In simple words, strace traces all system calls issued by a program along with their return codes. Think things such as file/socket operations and a lot more obscure ones.

It is most useful if you have some working knowledge of C since here system calls would more accurately stand for standard C library calls.

Let's say your program is /usr/local/bin/cough. Simply use:

strace /usr/local/bin/cough <any required argument for cough here>

or

strace -o <out_file> /usr/local/bin/cough <any required argument for cough here>

to write into 'out_file'.

All strace output will go to stderr (beware, the sheer volume of it often asks for a redirection to a file). In the simplest cases, your program will abort with an error and you'll be able to see what where its last interactions with the OS in strace output.

More information should be available with:

man strace

Solution 3:

strace lists all system calls done by the process it's applied to. If you don't know what system calls mean, you won't be able to get much mileage from it.

Nevertheless, if your problem involves files or paths or environment values, running strace on the problematic program and redirecting the output to a file and then grepping that file for your path/file/env string may help you see what your program is actually attempting to do, as distinct from what you expected it to.

Solution 4:

Strace stands out as a tool for investigating production systems where you can't afford to run these programs under a debugger. In particular, we have used strace in the following two situations:

  • Program foo seems to be in deadlock and has become unresponsive. This could be a target for gdb; however, we haven't always had the source code or sometimes were dealing with scripted languages that weren't straight-forward to run under a debugger. In this case, you run strace on an already running program and you will get the list of system calls being made. This is particularly useful if you are investigating a client/server application or an application that interacts with a database
  • Investigating why a program is slow. In particular, we had just moved to a new distributed file system and the new throughput of the system was very slow. You can specify strace with the '-T' option which will tell you how much time was spent in each system call. This helped to determine why the file system was causing things to slow down.

For an example of analyzing using strace see my answer to this question.