Excessive 'TCP Dup ACK' & 'TCP Fast Retransmission' causing issues on network. What's causing this?

I'm getting excessive TCP Dup ACKs and TCP Fast Retransmissions on our network when I transfer files over the Metro Ethernet link. The two sites are connected by a single SonicWall router, so the sites are only one hop away.

Here is a screenshot from Wireshark, and here is the entire capture. In this capture, the client is 192.168.2.153 and the server is 192.168.1.101. Here is a traceroute from my system to the server (ping times are usually steady, under 10 ms):

user@pc567:~$ ifconfig eth0
eth0      Link encap:Ethernet  HWaddr 00:e0:b8:c8:0c:7e  
          inet addr:192.168.2.153  Bcast:192.168.2.255  Mask:255.255.255.0
          inet6 addr: fe80::2e0:b8ff:fec8:c7e/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:244994 errors:0 dropped:0 overruns:0 frame:0
          TX packets:149148 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:319571991 (319.5 MB)  TX bytes:12322180 (12.3 MB)
          Interrupt:16 

user@pc567:~$ traceroute -n 192.168.1.101
traceroute to 192.168.1.101 (192.168.1.101), 30 hops max, 60 byte packets
 1  192.168.2.254  0.747 ms  0.706 ms  0.806 ms
 2  192.168.1.101  8.995 ms  9.217 ms  9.477 ms
user@pc567:~$

Any help figuring out what's causing this would be appreciated! I can post more details if needed.

UPDATE: Since this started, I've replaced the SonicWall with a Cisco 1800 router. The packet capture with it installed showed the same results. Since it is a Metro Ethernet circuit, no router is required, so I've also tried connecting two laptops directly to the service provider's equipment at both sites and putting them on the same subnet. The packet capture looks the same doing it this way. This leads me to believe that there is a problem with the Metro Ethernet circuit, even though the provider continues to say nothing is wrong and everything tests OK.


Solution 1:

I realize that this answer is simplified, and not as explicit as I'd like it to be, so if you have questions about a step, please ask!

Scrolling down a bit after opening this file in Wireshark, we see some frames in different colors. Looks really bad, right? Well, it's not that bad. Hold on, we'll get there.

Checking the SYN packet (frame 37) we see SACK and Window Scaling in the TCP options. Good. Same thing in the SYN/ACK (frame 38): SACK and Window Scaling. Awesome. I don't see anything weird regarding SACK.

An estimate of the unloaded RTT is the time between the SYN packet and the first ACK (frame 39). It's about 9.3 ms, which matches your findings. Note that the time between SYN/ACK and ACK (frames 38 and 39) is much lower than that between SYN and SYN/ACK (37 and 38). This means that this capture file was taken at the receiver, and to see why that's not ideal, we'll have to go back to school.
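If you'd rather check those handshake deltas without clicking through Wireshark, here is a minimal sketch using pyshark (the file name transfer.pcap and the stream index are assumptions; adjust them to your capture, where the handshake starts at frame 37):

    import pyshark

    # Pull the first three packets of the assumed TCP stream: SYN, SYN/ACK, ACK.
    cap = pyshark.FileCapture('transfer.pcap', display_filter='tcp.stream == 0')
    times = []
    for pkt in cap:
        times.append(float(pkt.sniff_timestamp))
        if len(times) == 3:
            break
    cap.close()

    syn, synack, ack = times
    print('SYN     -> SYN/ACK : %.3f ms' % ((synack - syn) * 1000))   # ~RTT
    print('SYN/ACK -> ACK     : %.3f ms' % ((ack - synack) * 1000))   # ~0 here

Whichever delta is near zero is the local side of the capture; here the near-zero SYN/ACK -> ACK gap is what points at a receiver-side capture.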

Between the sender and the receiver there is one part of the network path with the smallest bandwidth, and it limits the throughput. The RTT estimate we just got from the handshake gives us an estimate of the length of this network path. A measure of how much data we can fit in this pipe is the Pipe Capacity, or Bandwidth Delay Product: PC [bits] = R [bits/s] * RTT [s], where R is the smallest bandwidth. Pipe Capacity is thus a measure of volume.
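As a quick sketch of that formula in Python (the 3 Mbit/s and ~10 ms figures are the estimates used later in this answer, not anything this snippet measures):

    def pipe_capacity_bytes(rate_bps, rtt_s):
        # PC [bits] = R [bits/s] * RTT [s]; divide by 8 to get bytes.
        return rate_bps * rtt_s / 8

    rate = 3e6      # R: smallest bandwidth on the path, in bits/s
    rtt = 10e-3     # unloaded RTT from the handshake, in seconds
    print(pipe_capacity_bytes(rate, rtt))   # 3750.0 bytes, i.e. 2-3 segments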

Imagine a garden hose. Its volume is defined by its length and its width in the same way, right? To get the most water out of it, it needs to be completely filled with water, otherwise there will be air gaps limiting the water flow. In case we manage to fill it completely, it might overflow. We can use a bucket so that we won't get the floor wet, and if the bucket overflows, that doesn't affect the water flow.

It turns out that it's exactly the same in the network path. We need to fill the pipe... In other words, Pipe Capacity is the smallest amount of bytes in flight (how much water we have in the pipe + bucket) between the sender and the receiver that fully utilizes the smallest bandwidth (doesn't cause air gaps). So if the bytes in flight >= PC, then we're good!

Looking at the TCP trace (Statistics -> TCP StreamGraph -> Time Sequence Graph (tcptrace)) we can see bytes on the Y-axis and time on the X-axis. The derivative of this curve is bytes/second, or throughput. Note how the black "line" is flat, meaning throughput is stable! It's interrupted by blue lines a couple of times (those are the SACK ranges in the duplicate ACKs), but as can be seen this does not affect throughput.

See how the lower right gray solid line (zoom in a bit; those are the ACKs) is really close to the black TCP segments? The time between a TCP segment and its ACK is the RTT, and here it's almost 0! It means that there are not many segments in flight past this capture point. This in turn means that we can't use this capture to estimate the bytes in flight, and this is why a sender-side packet capture is way better.

Packets here are naturally lost before the capture point. Each data segment that was in flight at the time of the loss triggers a duplicate ACK. Therefore we can use the number of duplicate ACKs to estimate the bytes in flight at the time of the packet loss. Here we see about 9, 16 and 23 segments. Each segment carries 1448 bytes of data, so that gives us bytes in flight between roughly 13 kB and 33 kB. The throughput here was about 3 Mbit/s (from the IO graph), and with the RTT we measured before we get a Pipe Capacity of at most 3e6 [bits/s] * 10e-3 [s] / 8 = 3750 bytes, or less than 3 segments.
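To make that arithmetic explicit, here is the same back-of-the-envelope calculation in Python (all numbers are the estimates quoted above, not anything measured by the snippet):

    mss = 1448                    # bytes of data per segment
    dup_ack_runs = [9, 16, 23]    # duplicate ACKs counted per loss event

    for n in dup_ack_runs:
        print('%2d segments in flight ~ %5d bytes' % (n, n * mss))
    # -> roughly 13 kB, 23 kB and 33 kB in flight at each loss

    pc = 3e6 * 10e-3 / 8          # Pipe Capacity: 3 Mbit/s * 10 ms, in bytes
    print('Pipe Capacity ~ %.0f bytes (~%d segments)' % (pc, pc // mss))
    # Bytes in flight are well above PC, so these losses do not hurt throughput.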

Because the bytes in flight at the time of these losses are not really the same (hard to tell here with so few samples), I can't really say if these are random losses (which are bad, bad, bad) or losses occurring because a queue/bucket overflows, but since they are occurring when bytes in flight > PC, throughput is not affected.

Your answer seems to indicate that they were indeed random, but not so many as to cause low throughput.

Solution 2:

Just now posting what I found out. The Metro Ethernet provider came out one Saturday to our main office. They disconnected the network there and also had someone at a nearby branch. They connected network testing equipment at both ends and were quickly able to determine there was in fact a problem. Several hours later, they were able to isolate it: a problem with the copper lines running from the provider's central office to our main office. They said frames were dropping like crazy, which is what was causing the retransmissions. They fixed the issue with the copper wiring at their central office (they said they had to pull apart each wire one at a time, which sounds like BS to me), and after they did this the problem was resolved.

Solution 3:

Looking at the capture you provided (thank you for doing that!) I can see a pretty classic retransmit pattern towards the beginning. You can see it around packet 50. There is a missing packet between 51 and 52. What's happening is this:

  1. --> Packet 50 Data
  2. <-- Packet 51 ACK packet 50.
  3. --> Packet 52 Data
  4. <-- Packet 53 ACK packet 50.
  5. --> Packet 54 Data
  6. <-- Packet 55 ACK packet 50.

A data packet got dropped, and the receiver is indicating this by continuing to ACK everything up to what it has seen so far. What's interesting here is that both sides had TCP SACK Permitted Option = True set when they negotiated the connection, so packet 55 should have a SACK block in it, and it doesn't. Selective Acknowledgments allow a receiver to indicate "I've seen everything up to 51, but also 53-55", which reduces the number of retransmits needed to get things back up to full speed.
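If you want to verify that claim directly against the capture, here is a minimal sketch using pyshark (the file name transfer.pcap is an assumption; the display filter fields are standard Wireshark ones):

    import pyshark

    def count(filter_expr, path='transfer.pcap'):
        # Count packets matching a Wireshark display filter.
        cap = pyshark.FileCapture(path, display_filter=filter_expr, keep_packets=False)
        n = sum(1 for _ in cap)
        cap.close()
        return n

    dup_acks = count('tcp.analysis.duplicate_ack')
    with_sack = count('tcp.analysis.duplicate_ack && tcp.options.sack_le')
    print('%d duplicate ACKs, of which %d carry a SACK block' % (dup_acks, with_sack))

If the second number is zero even though both SYNs advertised SACK Permitted, something in the path (here, I suspect the firewalls) is stripping the SACK blocks.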

Since it can't use SACK, it falls back to the standard TCP retransmit method of repeating "I've seen up to 50" until the other side figures it out and retransmits everything from 50 onward.

There is a retransmit in packet 66, which is immediately followed by an ACK up to packet 56. After the second retransmit (packet 72) the connection is back on track.

First off, it looks likely that the SACK headers are getting stripped out by the SonicWalls, which is preventing retransmits from recovering as fast as the two sides negotiated. Personally, I'm of the opinion that SACK stripping is pointless, but others may disagree.

From what I can tell from this capture, you're seeing occasional packet loss, which is causing the TCP connections to go through the normal retransmit process. The firewalls are getting in the way, since a retransmit method both sides negotiated for is not being allowed.