How to find out the PID of the process sending packets (generating network traffic)?

Solution 1:

What's wrong with auditctl?

You would do it like this:

1) Define your audit rule to audit sendmsg and sendto system calls. These system calls are used during name resolution.

auditctl -a exit,always -F arch=b64 -S sendmsg -S sendto -k send

2) Now search your audit records. Here the grep matches the DNS port (serv:53); you could also grep for the remote DNS IP:

ausearch -k send -i|grep -A2 "serv:53"

In the example below, you can see that the application responsible for the system call is dig:

ausearch -k send -i|grep -A2 "serv:53"
type=SOCKADDR msg=audit(10/31/2016 15:24:56.264:176998) : saddr=inet host:172.16.0.23 serv:53 
type=SYSCALL msg=audit(10/31/2016 15:24:56.264:176998) : arch=x86_64 syscall=sendmsg success=yes exit=29 a0=14 a1=7fa1919f9ac0 a2=0 a3=7fa1919f9780 items=0 ppid=31729 pid=32047 auid=root uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=pts5 ses=52 comm=dig exe=/usr/bin/dig subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 key=send


comm=dig exe=/usr/bin/dig

The SOCKADDR record shows which remote DNS server the request was sent to, so you just have to grep for a particular DNS host:

saddr=inet host:172.16.0.23 serv:53

Or even better, see which DNS hosts are used (I have only one in this example):

ausearch -k send -i|grep "serv:53"|awk '{print $6}'|sort|uniq -c
      3 host:172.16.0.23

And then narrow down which apps are using those particular hosts.
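The narrowing-down step can be sketched as a small filter over the ausearch output. This is only a sketch: dns_senders is a made-up helper name, and it assumes that each SOCKADDR record is followed by its SYSCALL record within the same event, as in the sample output above.

```shell
# Hypothetical helper (not part of the audit tools): reads
# `ausearch -k send -i` output on stdin and prints "host pid exe"
# for every event whose SOCKADDR record targets port 53.
dns_senders() {
  awk '
    /^----/ { host = ""; next }          # event separator: forget pending host
    /^type=SOCKADDR/ {
      host = ""
      if ($0 ~ / serv:53( |$)/)
        for (i = 1; i <= NF; i++)
          if ($i ~ /^host:/) host = substr($i, 6)
      next
    }
    /^type=SYSCALL/ && host != "" {
      pid = ""; exe = ""
      for (i = 1; i <= NF; i++) {
        if ($i ~ /^pid=/) pid = substr($i, 5)   # ppid= does not match ^pid=
        if ($i ~ /^exe=/) exe = substr($i, 5)
      }
      print host, pid, exe
      host = ""
    }
  '
}
```

It would be used as ausearch -k send -i | dns_senders | sort | uniq -c, which groups the records by DNS host, PID and executable.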

Edit 1: Actually, I just ran strace on a simple ping to a host. It seems that sendmsg is not always used. Here is what I see:

socket(PF_INET, SOCK_DGRAM|SOCK_NONBLOCK, IPPROTO_IP) = 4
connect(4, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("172.16.0.23")}, 16) = 0
gettimeofday({1477929832, 712018}, NULL) = 0
poll([{fd=4, events=POLLOUT}], 1, 0)    = 1 ([{fd=4, revents=POLLOUT}])
sendto(4, "\3\326\1\0\0\1\0\0\0\0\0\0\tvkontakte\2ru\0\0\1\0\1", 30, MSG_NOSIGNAL, NULL, 0) = 30
poll([{fd=4, events=POLLIN}], 1, 5000)  = 1 ([{fd=4, revents=POLLIN}])
ioctl(4, FIONREAD, [62])                = 0
recvfrom(4, "\3\326\201\200\0\1\0\2\0\0\0\0\tvkontakte\2ru\0\0\1\0\1\300\f"..., 1024, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("172.16.0.23")}, [16]) = 62
close(4)                                = 0

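My previous example was based on dig, which takes a slightly different route in terms of system calls. In the trace above the destination is set by connect, and sendto is then called with a NULL address, so the remote address shows up on the connect call rather than on sendto.
So it looks like in the majority of cases the rule would be this: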

auditctl -a exit,always -F arch=b64 -S connect -k connect

Followed by ausearch

ausearch -k connect -i|grep saddr|grep "serv:53"|awk '{print $6}'|sort|uniq -c

Solution 2:

I wrestled with the very same problem a few days ago, and came up with a very simple method. It is based on the fact that the sending process will be waiting for a DNS response to come, on the same port it sent the request from:

1) Find out the source port of the outgoing DNS request, with iptables -j LOG.
2) Use lsof -i UDP:<source_port> to find out which process is waiting for the response on that port.

Of course, as the response arrives within milliseconds, you can't do that manually. Moreover, even when automated, there's no guarantee that you will be able to query the system before the DNS response arrives and the sending process exits. That is why, before even executing the above steps, I also configure the kernel Traffic Controller to delay outgoing packets directed to a specific IP/port (using the tc module netem). This allows me to control the time window I have to query the system about which PID is waiting for the DNS response on the source UDP port obtained in step 1.

I have automated the above steps, including the tc delay, in a small script called ptrap (which is a more general solution, not limited to DNS requests, and thus suitable for detecting processes using any TCP/UDP-based protocol). With its aid I found out that, in my case, the service contacting the old DNS server was sendmail.
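To illustrate the first step, here is a minimal sketch of pulling the source port out of the lines written by an iptables LOG rule. It is not taken from ptrap: the function name and the "dnstrap: " log prefix are made up, and it assumes the usual SPT=/DPT= fields of the kernel LOG target.

```shell
# Assumption: a LOG rule similar to this one is already in place
# ("dnstrap: " is an arbitrary prefix, 172.16.0.23 is the DNS server):
#   iptables -I OUTPUT -p udp -d 172.16.0.23 --dport 53 -j LOG --log-prefix "dnstrap: "

# Hypothetical helper: reads kernel log lines on stdin and prints the
# UDP source port (SPT=) of every logged packet sent to port 53.
dns_source_ports() {
  sed -n 's/.*PROTO=UDP SPT=\([0-9][0-9]*\) DPT=53 .*/\1/p'
}

# Possible use for step 2 (only meaningful while the tc delay is active):
#   dmesg | dns_source_ports | while read -r port; do
#     lsof -n -P -i "UDP:$port"
#   done
```

Without the netem delay, the socket has most likely been closed by the time lsof runs, which is exactly the race described above.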

Solution 3:

There is atop. A kernel module (netatop) and a daemon make atop track network usage per process.

You should first install atop.

Here is how you install the kernel module. This was valid when the post was written, but it may become outdated:

sudo apt install linux-headers-$(uname -r) make zlib1g-dev
wget https://www.atoptool.nl/download/netatop-2.0.tar.gz
tar xvf netatop-2.0.tar.gz
cd netatop-2.0
make
sudo make install
sudo modprobe -v netatop

If you have systemd, create the service file netatopd.service in /etc/systemd/system/. It would contain:

[Unit]
Description=NetAtop Daemon

[Service]
Type=forking
ExecStart=/usr/sbin/netatopd

[Install]
WantedBy=multi-user.target

Now you can enable the daemon:

sudo systemctl enable netatopd

To see live per-process network usage:

sudo atop -n

To see the top 3 network-intensive processes throughout the day:

atopsar -N

See man atopsar for more options.

Solution 4:

There are many options to netstat that show combinations of listening/open sockets over tcp/udp/both. Something like:

$> sudo netstat -pan
Active Internet connections (servers and established)
Proto  Recv-Q Send-Q Local Addr            Foreign Addr           State       PID/Program name
...
tcp    0      1      192.168.66.1:39219    192.168.66.139:2003    SYN_SENT    2045/logstash-forwa

...would have given you a lot of output, but included the source, destination, port numbers, and PID of the process owning those ports.
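Since that output is long, a small filter helps to pick out the owner of one particular connection. The following is a sketch only; owners_of_peer is a made-up name, and it assumes the column layout shown above (foreign address in the fifth column, PID/Program name last).

```shell
# Hypothetical helper: reads `netstat -pan` output on stdin and prints
# "pid program local foreign" for tcp/udp sockets whose foreign address
# equals $1 (for example 192.168.66.139:2003).
owners_of_peer() {
  awk -v peer="$1" '
    ($1 ~ /^(tcp|udp)/) && $5 == peer {
      # The State column is empty for some UDP sockets, so search for the
      # first field that looks like "PID/name" instead of a fixed column.
      for (i = 6; i <= NF; i++)
        if ($i ~ /^[0-9]+\//) {
          split($i, a, "/")
          print a[1], a[2], $4, $5
          break
        }
    }
  '
}
```

For the example above, sudo netstat -pan | owners_of_peer 192.168.66.139:2003 would print the PID 2045 and the (truncated) program name logstash-forwa.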

Solution 5:

+1 for Dmitry's answer above; that worked nicely for me:

auditctl -a exit,always -F arch=b64 -F a0=2 -S socket -k SOCKET

To see the resulting entries, I grep the log file for that "-k" string

grep SOCKET /var/log/audit/audit.log

To get just the interesting fields,

grep SOCKET /var/log/audit/audit.log | \
  cut -d' ' -f 4- | \
  sed "s|^|@\n|g;s| |\n|g" | \
  grep -E "^((exe|uid|comm)=|@)" | \
  tr '\n@' ' \n' |\
  sort -u

(explanation: cut -d' ' -f 4- -> chop the line into fields using space (-d' ') as delimiter, show fields fourth to last ( 4- ) )

(explanation: sed "s|^|@\n|g;s| |\n|g" -> edit line, prepend '@' char-plus-newline to start of line, change spaces to newlines)

(explanation: grep -E "^((exe|uid|comm)=|@)" -> as each field of the original line is now on its own line, pick out the interesting fields: user-id, command, executable - and the line-start '@' char.)

(explanation: tr '\n@' ' \n' -> now having only the wanted fields, turn the newlines back into spaces, and the prepended '@' back into a newline (which rejoins the fields into one line))

(explanation: sort -u -> sort lines, show only unique lines)

gives me:

uid=0 comm="atop" exe="/usr/bin/atop" 
uid=0 comm="http" exe="/usr/lib/apt/methods/http" 
uid=0 comm="links" exe="/usr/bin/links" 
uid=0 comm="ntpdate" exe="/usr/sbin/ntpdate" 
uid=0 comm="ufdbguardd" exe="/usr/local/ufdbguard/bin/ufdbguardd" 
uid=1000 comm=536F636B657420546872656164 exe="/usr/lib/firefox/firefox" 
uid=1000 comm="clock-applet" exe="/usr/lib/mate-panel/clock-applet" 
uid=1000 comm="pool" exe="/usr/lib/mate-panel/clock-applet" 
uid=105 comm="http" exe="/usr/lib/apt/methods/http" 
uid=105 comm="https" exe="/usr/lib/apt/methods/https" 
uid=135 comm="unbound" exe="/usr/sbin/unbound" 
uid=13 comm="squid" exe="/usr/src/squid-4-master/src/squid" 
uid=1 comm="debsecan" exe="/usr/bin/python2.7"
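As an alternative to the cut/sed/grep/tr chain, the same three fields can probably be picked out in a single awk pass. This is a sketch with a made-up function name, assuming the raw audit.log format where every field is a space-separated key=value pair.

```shell
# Hypothetical helper: reads raw audit.log lines on stdin and prints the
# unique "uid=... comm=... exe=..." combinations.
audit_fields() {
  awk '{
    uid = comm = exe = ""
    for (i = 1; i <= NF; i++) {
      if ($i ~ /^uid=/)  uid  = $i   # auid=, euid= etc. do not match ^uid=
      if ($i ~ /^comm=/) comm = $i
      if ($i ~ /^exe=/)  exe  = $i
    }
    if (exe != "") print uid, comm, exe
  }' | sort -u
}
```

Usage would be grep SOCKET /var/log/audit/audit.log | audit_fields.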

Commands containing spaces are encoded with a simple ascii-to-hex method (see audit_logging.c). To decode, replace each hex pair such as "FF" with "&#xFF;" and recode that from html to ascii:

grep SOCKET /var/log/audit/audit.log | \
  cut -d' ' -f 4- | sed "s|^|@\n|g;s| |\n|g" | \
  grep -E "^((exe|uid|comm)=|@)" | tr '\n@' ' \n' | \
  sort -u  | sed "s|^[^=]*=||g;s| [^ ]*=| |g" | \
  while read U C E ; do \
    echo "$C" | grep -q '"' || \
      { C=\"`echo $C | sed "s|\(..\)|\&#x\1;|g" | recode h4..u8`\" ; } ; \
    echo "uid=$U comm=$C exe=$E" ; 
  done

(explanation: sed "s|^[^=]*=||g;s| [^ ]*=| |g" -> edit away the 'xxx=' part of the lines - first: line-start (^) followed by any-chars-except-'=' and the '=' itself is replaced with blank, then space followed by any-chars-except-' ' and '=' is replaced with space)

(explanation: while read U C E ; do ... done -> loop over each line, reading each of our three bits of data into U,C,E (userid, command, executable))

(explanation: echo "$C" | grep -q '"' || -> test the command field to see if it contains a doublequote - if not ('||') then do the following: )

(explanation: { C=\"`echo $C | sed "s|\(..\)|\&#x\1;|g" | recode h4..u8`\" ; } -> print the command string, edit each pair of chars 'FF' to be '&#xFF;', then pass through gnu 'recode' to turn them from html entities into plain (UTF-8) chars.)

(explanation: echo "uid=$U comm=$C exe=$E" -> print out the modified line)

This gives me output (just showing the decoded line):

uid=1000 comm="Socket Thread" exe="/usr/lib/firefox/firefox"
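If recode is not installed, the hex decoding can also be sketched with awk alone. The function name below is made up, and the sketch assumes the encoded value is plain ASCII, as in the firefox example.

```shell
# Hypothetical helper: decodes a hex-encoded audit comm value given as $1.
# Values that are already quoted are passed through unchanged.
decode_comm() {
  printf '%s\n' "$1" | awk '
    /^"/ { print; next }
    {
      hex = "0123456789ABCDEF"; out = ""
      s = toupper($0)
      # Convert each pair of hex digits into one character.
      for (i = 1; i < length(s); i += 2) {
        hi = index(hex, substr(s, i, 1)) - 1
        lo = index(hex, substr(s, i + 1, 1)) - 1
        out = out sprintf("%c", hi * 16 + lo)
      }
      print "\"" out "\""
    }
  '
}
```

For example, decode_comm 536F636B657420546872656164 yields the quoted string "Socket Thread", matching the decoded line above.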
