tcpdump increases UDP performance
I'm running a set of load tests to determine the performance of the following setup:
Node.js test suite (client) --> StatsD (server) --> Graphite (server)
In short, the Node.js test suite sends a set amount of metrics every x seconds to a StatsD instance located on another server. StatsD in turn flushes the metrics every second to a Graphite instance on the same server. I then compare how many metrics were actually sent by the test suite with how many were received by Graphite to determine the packet loss between the test suite and Graphite.
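For reference, each metric is just a tiny UDP datagram in the StatsD line format; a single one can be reproduced from the shell like this (the host and metric name are placeholders, not what the test suite actually sends):

# STATSD_HOST is a placeholder for the server running StatsD; the metric name is made up.
echo -n "example.metric:1|c" | nc -u -w1 "$STATSD_HOST" 8125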
However, I noticed that I sometimes got very large packet drop rates (note that the metrics are sent over UDP), ranging from 20-50%. That's when I started looking into where these packets were being dropped, since it could be a performance issue with StatsD. So I started logging the metrics in every part of the system to track down where the drop occurred, and this is where things get weird.
I'm using tcpdump to create a capture file which I inspect after the test has finished. But whenever I run the tests with tcpdump running, the packet loss is almost nonexistent! It looks like tcpdump is somehow increasing the performance of my tests and I can't figure out why or how it does this. I'm running the following command to capture the traffic on both the server and the client:
tcpdump -i any -n port 8125 -w test.cap
In one particular test case I'm sending 40,000 metrics/s. The test with tcpdump running has a packet loss of about 4%, while the one without has a packet loss of about 20%.
Both systems are running as Xen VMs with the following setup:
- Intel Xeon E5-2630 v2 @ 2.60GHz
- 2GB RAM
- Ubuntu 14.04 x86_64
Things I already checked for potential causes:
- Increasing the UDP buffer receive/send size (see the sysctl sketch after this list).
- CPU load affecting the test (max. load of 40-50%, both client and server side).
- Running tcpdump on specific interfaces instead of 'any'.
- Running tcpdump with '-p' to disable promiscuous mode.
- Running tcpdump only on the server. This resulted in the 20% packet loss occurring, so it seems to have no impact on the tests.
- Running tcpdump only on the client. This resulted in increased performance.
- Increasing netdev_max_backlog and netdev_budget to 2^32-1. This made no difference.
- Tried every possible combination of promiscuous mode on every NIC (server on and client off, server off and client on, both on, both off). This made no difference.
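For completeness, the buffer-related items above were changed through sysctl; the exact values aren't the point, but the knobs were along these lines (the numbers shown here are placeholders, and netdev_max_backlog/netdev_budget were pushed to 2^32-1 in the actual tests):

# Placeholder values -- shown only to indicate which knobs were touched.
sysctl -w net.core.rmem_max=26214400          # max socket receive buffer
sysctl -w net.core.rmem_default=26214400      # default socket receive buffer
sysctl -w net.core.wmem_max=26214400          # max socket send buffer
sysctl -w net.core.netdev_max_backlog=300000  # per-CPU input queue length
sysctl -w net.core.netdev_budget=600          # packets processed per softirq round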
When tcpdump is running, it will be fairly prompt at reading in the incoming frames. My hypothesis is that the NIC's packet ring buffer settings may be a bit on the small side; when tcpdump is running the buffer is being emptied in a more timely manner.
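One way to test that hypothesis is to watch the kernel's own drop counters while the test runs, roughly like this (the interface name is an assumption, substitute the VM's actual NIC):

# eth0 is an assumption; adjust to the interface carrying the StatsD traffic.
netstat -su | grep -i error       # UDP receive buffer errors = socket buffer overruns
ethtool -S eth0 | grep -i drop    # NIC/driver drop counters, where the driver exposes them
cat /proc/net/softnet_stat        # 2nd column = drops from a full netdev backlog queue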
If you're a Red Hat subscriber, the support article Overview of Packet Reception is very useful. It covers some things I don't think you've considered yet.
Consider how your system is dealing with IRQs; consider increasing the 'dev_weight' of the network interface (meaning more packets are read off the NIC per softirq poll); look at how often the application reads the socket (can it use a dedicated thread, and are there known issues/workarounds regarding scalability?).
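For example (the values here are illustrative, not recommendations):

# Illustrative only -- eth0 and the value 128 are assumptions.
sysctl net.core.dev_weight            # packets taken off the NIC per poll (default 64)
sysctl -w net.core.dev_weight=128     # let each poll drain more of the ring
grep eth0 /proc/interrupts            # which CPUs are servicing the NIC's interrupts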
Increase the NIC frame buffer (using the ethtool command -- look at the --set-ring etc. arguments).
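A sketch of what that looks like (the interface name and size are assumptions, and a Xen netfront interface may not support it at all):

# eth0 and 4096 are assumptions; check the maximums reported by -g first.
ethtool -g eth0            # show current and maximum RX/TX ring sizes
ethtool -G eth0 rx 4096    # --set-ring: raise the RX ring towards its maximum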
Look at 'receive side scaling' (RSS) and use at least as many receive threads as there are receive queues to read in the traffic.
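You can see how many receive queues the interface exposes with ethtool as well (again, eth0 is an assumption, and a Xen virtual NIC may only offer a single queue):

# eth0 and the channel count are assumptions.
ethtool -l eth0               # show how many RX/combined queues are available
ethtool -L eth0 combined 4    # example: spread receive work over 4 queues, if supported
grep eth0 /proc/interrupts    # confirm there is one IRQ per queue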
I wonder if tcpdump is doing something cool such as using the kernel support for packet ring buffers. That would help to explain the behaviour you are seeing.
What power governor are you using? I've seen similar behaviors with the "ondemand" or "conservative" governors.
Try using the "performance" governor and disabling any power-saving features in the server BIOS.
Does it change anything?
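For reference, on Ubuntu 14.04 the governor can be checked and switched roughly like this (a Xen guest may not expose cpufreq at all, in which case it has to be set in dom0/the hypervisor):

# May be unavailable inside the VM; if so, set the governor on the Xen host instead.
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor    # current governor
for g in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
    echo performance | sudo tee "$g"                         # switch each CPU to "performance"
done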