How can I find the total number of TCP connections for a given port and period of time by IP?

Solution 1:

Turn on iptables and set it to LOG for incoming connections. Example rule:

 -A INPUT -p tcp --dport 4711 -m state --state NEW -j LOG

(where 4711 is the port you want to track).

Then run the resulting log through whatever script you like that can do the summary for you.
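For example, a minimal sketch of such a summary by source IP, assuming the LOG entries land in /var/log/kern.log (the path is distribution-dependent, and adding --log-prefix to the rule makes the grep more robust):

# count logged new connections to port 4711 per source IP
# (narrow the time period by grepping on the syslog timestamps first if needed)
grep 'DPT=4711 ' /var/log/kern.log | grep -o 'SRC=[0-9.]*' | sort | uniq -c | sort -rn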

Solution 2:

You can use tcpdump to log all SYN (without ACK) packets:

tcpdump "dst port 4711 and tcp[tcpflags] & (tcp-syn|tcp-ack) == tcp-syn"

or log all SYN+ACK packets (established connections):

tcpdump "src port 4711 and tcp[tcpflags] & (tcp-syn|tcp-ack) == (tcp-syn|tcp-ack)"

And then combine it with wc -l to count all the lines.

You'd also need a way to measure fixed periods of time (you could have a cron job send tcpdump a SIGINT at regular intervals; tcpdump counts bytes and packets itself, but timestamps only appear in the logged lines).
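For instance, a rough sketch of a per-IP count over a fixed five-minute window, assuming IPv4 traffic and an eth0 interface (both are just example values):

# capture SYNs to port 4711 for 300 seconds, then count them per client IP
# ($3 in tcpdump's default output is the source address.port; the awk strips the port)
timeout 300 tcpdump -n -l -i eth0 "dst port 4711 and tcp[tcpflags] & (tcp-syn|tcp-ack) == tcp-syn" \
  | awk '{ split($3, a, "."); print a[1] "." a[2] "." a[3] "." a[4] }' \
  | sort | uniq -c | sort -rn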

Update: needless to say, have a look at the tcpdump man page and consider some options such as -i (listen on a single interface only), -p (disable promiscuous mode; less invasive), or some of the output options. tcpdump needs root permissions, and your boss may not like it because it has a reputation as a hacker tool. On the other hand, you don't need to change anything on your system to run it (in contrast to the iptables LOG solution).

Please also note the small src/dst difference in the filters. If you capture SYN+ACK packets and want to count connections to a server at port 4711, you need src. If you capture SYN-without-ACK packets for the same result, you need dst. If you count connections on the server itself, you always have to use the reverse.

Solution 3:

SystemTap solution

Script inspired by the tcp_connections.stp example:

#!/usr/bin/env stap
# To monitor another TCP port run:
#     stap -G port=80 tcp_connections.stp
# or
#     ./tcp_connections.stp -G port=80
global port = 22
global connections

function report() {
  foreach (addr in connections) {
    printf("%s: %d\n", addr, @count(connections[addr]))
  }
}

probe end {
  printf("\n=== Summary ===\n")
  report()
}

probe kernel.function("tcp_accept").return?,
      kernel.function("inet_csk_accept").return? {
  sock = $return
  if (sock != 0) {
    local_port = inet_get_local_port(sock)
    if (local_port == port) {
      remote_addr = inet_get_ip_source(sock)
      connections[remote_addr] <<< 1
      printf("%s New connection from %s\n", ctime(gettimeofday_s()), remote_addr)
    }
  }
}

Output:

[root@bubu ~]# ./tcp_connections.stp -G port=80
Mon Mar 17 04:13:03 2014 New connection from 192.168.122.1
Mon Mar 17 04:13:04 2014 New connection from 192.168.122.1
Mon Mar 17 04:13:08 2014 New connection from 192.168.122.4
^C
=== Summary ===
192.168.122.1: 2
192.168.122.4: 1

strace solution

Either start the program under strace:

strace -r -f -e trace=accept -o /tmp/strace ${PROGRAM} ${ARGS}

or trace an already running program:

strace -r -f -e trace=accept -o /tmp/strace -p ${PID_OF_PROGRAM}

-r prints a relative timestamp on entry to each system call, in case it's needed later for extra performance analysis. -f traces child processes; it might not be needed.

The output looks something like this:

999        0.000000 accept(3, {sa_family=AF_INET, sin_port=htons(34702), sin_addr=inet_addr("192.168.122.4")}, [16]) = 5
999        0.008079 --- SIGCHLD (Child exited) @ 0 (0) ---
999        1.029846 accept(3, {sa_family=AF_INET, sin_port=htons(34703), sin_addr=inet_addr("192.168.122.4")}, [16]) = 5
999        0.008276 --- SIGCHLD (Child exited) @ 0 (0) ---
999        3.580122 accept(3, {sa_family=AF_INET, sin_port=htons(50114), sin_addr=inet_addr("192.168.122.1")}, [16]) = 5

and can be filtered with:

# gawk 'match($0, /^([0-9]+)[[:space:]]+([0-9.]+)[[:space:]]+accept\(.*htons\(([^)]+)\),.*inet_addr\("([^"]+)"\).*[[:space:]]+=[[:space:]]+([1-9][0-9]*)/, m) {connections[m[4]]++} END {for (addr in connections) printf("%s: %d\n", addr, connections[addr]); }' /tmp/strace
192.168.122.4: 3
192.168.122.1: 2

Short explanation of the awk one-liner: m[1] is the PID, m[2] is the timestamp, m[3] is the remote port and m[4] is the remote address.

The advantage of this solution is that root is not required if the server runs under the same user. The disadvantage is that all accepted connections are counted; there is no filtering by port, so it won't work if the application listens on multiple ports.

Solution 4:

Your system won't remember counts of past connections unless you tell it to, so don't expect to find counters like you have for total traffic through an interface unless you set something up to do that counting.

Also, in general, you cannot reliably do this counting by polling, as Jacek Lakomiec suggested, because some connections will start and finish faster than your polling period. That sort of approach might be acceptable in situations where you are sure the connections will live long enough, but I can't think of a good reason to prefer it.

As suggested by Jenny D and Daniel Alder, your options for counting connections as they occur are basically firewall-based counters and packet-capture-based counters. Both generally work well, although if your system is CPU-constrained you may fail to count some connections with the packet-capture approach, and it is also likely to consume more system resources. On the other hand, packet-capture approaches can be simpler and safer to set up for ad-hoc investigations.

There is another general class of solution: netflow. It's more involved to set up, but done right it's particularly efficient, and if you are doing large-scale or ongoing monitoring I'd look in this direction. The raw data can be captured in your firewall (eg fprobe-ulog) or via libpcap, which is slower (eg fprobe). The capture system sends flow data over the network to a collector (eg nfdump), and you then have a variety of tools for analysing that data (eg nfsen).
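As a very rough sketch of how those pieces fit together (the interface, port and paths here are just example values, so check the respective man pages before relying on the exact flags):

# libpcap-based probe: export flows seen on eth0 to a local collector
fprobe -i eth0 localhost:9995

# collector: write the flow data into files under /var/cache/nfdump
nfcapd -D -p 9995 -l /var/cache/nfdump

# analysis: top source IPs for flows to TCP port 4711
nfdump -R /var/cache/nfdump -s srcip 'proto tcp and dst port 4711'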

Some routers (particularly Cisco gear) come with netflow capture, it can also be added to other routers via third-party firmware, and of course you can run it on your Linux system. If you wish, many collection points can forward their flow data to a single collector. You can find free software options at eg http://www.networkuptime.com/tools/netflow/, and there are also many commercial offerings.

Netflow is designed for industrial scale use, but I've found it very serviceable for collecting data on use of my home network in a share-house so that I can identify who or what is responsible when traffic usage is higher than expected.

Be careful any time you're messing with firewall rules on a remote server, and in general I'd recommend finding a good front end to configure your firewall rather than issuing iptables commands directly. (I like ferm, but there are many good ones).

One other thing to think about: sometimes you don't want to do this at the network layer at all. Sometimes it's appropriate to monitor the daemon process's system calls with strace or similar. It's CPU intensive, so be careful about slowing down the daemon process, but in some circumstances it can be appropriate, depending mostly on what other information you need to gather at the same time, or perhaps whether you need to isolate a single forked child of the daemon.

Solution 5:

So far the solution that has worked best for me is to grab the contents of /proc/net/ip_conntrack every 20 seconds, log it to a file whose name contains a timestamp, and use those files as input to whatever filtering scripts, or even one-liners, you need. To save you time you can use my script. I use a crontab entry to make sure the script is run every minute (it runs for 60 seconds in the current configuration; feel free to modify it :-)

 cat conn_minute.sh
#!/bin/bash

# Dump /proc/net/ip_conntrack, gzip it and append it to an hourly tar archive.
function save_log {
    LOG_DIR=/mnt/logs/ip_conntrack/$(date +%Y%m%d)
    TEMP_FILE=$LOG_DIR/$(date +%Y%m%d_%H%M%S).gz
    LOG_FILE=$LOG_DIR/$(date +%Y%m%d_%H).tar
    if [ ! -d "$LOG_DIR" ]; then
        mkdir -p "$LOG_DIR"
    fi
    gzip -c /proc/net/ip_conntrack > "$TEMP_FILE"
    if [ -f "$LOG_FILE" ]; then
        tar -rf "$LOG_FILE" "$TEMP_FILE" 2> /dev/null
    else
        tar -cf "$LOG_FILE" "$TEMP_FILE" 2> /dev/null
    fi
    rm "$TEMP_FILE"
}

# Take LOOP_COUNTER snapshots, LOOP_TIME seconds apart (3 x 20 s covers one minute).
function log_minute {
    i=1
    LOOP_COUNTER=3
    LOOP_TIME=20
    while [ $i -le $LOOP_COUNTER ]; do
        save_log
        i=$((i + 1))
        sleep $LOOP_TIME
    done
}

log_minute

You can adjust how often the content of ip_conntrack is dumped by changing LOOP_COUNTER and LOOP_TIME accordingly. So to dump it every 5 seconds, it would be LOOP_COUNTER=12 and LOOP_TIME=5. LOG_DIR is simply where the logs are saved.

Afterwards you can use zcat on the files you're interested in and grep to filter the source IPs/ports of interest (or just use zgrep). grep -c will count whatever you're after, and you can also chain filters, e.g. zcat <file> | grep 'src=1.2.3.4' | grep 'dport=63793' | sort | uniq | wc -l.
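For a per-IP summary, a rough sketch along these lines works against the hourly tar archives produced by the script above (the path, port and exact /proc/net/ip_conntrack line format are example assumptions; note that a long-lived connection shows up in every snapshot, so this counts snapshot occurrences rather than distinct connections):

# extract all gzipped snapshots from one hourly archive, keep entries for port 4711,
# and count the first src= field (the client side of the original tuple) per IP
tar -xOf /mnt/logs/ip_conntrack/20140317/20140317_04.tar | zcat \
  | awk '/dport=4711 / { for (i = 1; i <= NF; i++) if ($i ~ /^src=/) { print $i; break } }' \
  | sort | uniq -c | sort -rn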