better TCP performance over a “high delay network”

I’m trying to improve my TCP throughput over a “high delay network” between Linux machines.

I set tcp_mem, tcp_wmem and tcp_rmem to “8192 7061504 7061504”.
I set rmem_max, wmem_max, rmem_default and wmem_default to “7061504”.
I set netdev_max_backlog and txqueuelen to 10000.
I set tcp_congestion_control to “scalable”.
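
In case the exact commands matter, this is roughly how I applied those values (eth0 is just a placeholder for the real interface):

    # buffer settings
    sysctl -w net.ipv4.tcp_mem="8192 7061504 7061504"
    sysctl -w net.ipv4.tcp_wmem="8192 7061504 7061504"
    sysctl -w net.ipv4.tcp_rmem="8192 7061504 7061504"
    sysctl -w net.core.rmem_max=7061504
    sysctl -w net.core.wmem_max=7061504
    sysctl -w net.core.rmem_default=7061504
    sysctl -w net.core.wmem_default=7061504
    # queue sizes and congestion control
    sysctl -w net.core.netdev_max_backlog=10000
    sysctl -w net.ipv4.tcp_congestion_control=scalable
    ifconfig eth0 txqueuelen 10000   # eth0 is just a placeholder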

I’m using NIST Net (cnistnet) to simulate a delay of 100 ms, and the bandwidth I reach is about 200 Mbps (without the delay I reach about 790 Mbps).

I’m using iperf to perform the tests and tcptrace to analyze the results (a rough sketch of the commands follows the numbers below), and here is what I got:

On the receiver side:
max win adv: 5294720 bytes
avg win adv: 5273959 bytes
sack pkts sent: 0

On the sender side:
actual data bytes: 3085179704
rexmt data bytes: 9018144
max owin: 5294577 bytes
avg owin: 3317125 bytes
RTT min: 19.2 ms
RTT max: 218.2 ms
RTT avg: 98.0 ms
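
The test setup was something along these lines (the interface, address and file names are just placeholders, not the literal commands):

    # receiver
    iperf -s

    # sender: 2-minute test, report every 10 seconds
    iperf -c 192.168.1.10 -t 120 -i 10

    # capture on the sender, then run tcptrace in "long output" mode on it
    tcpdump -i eth0 -w iperf_run.pcap
    tcptrace -l iperf_run.pcap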

Why do I reach only 200 Mbps? I suspect the “owin” (outstanding data window) has something to do with it, but I’m not sure (these results are from a 2-minute test; a 1-minute test had an “avg owin” of 1552900)…

Am I wrong to expect the throughput to reach almost 790 Mbps even with a 100 ms delay?

(I tried using larger values in the window settings, but it didn’t seem to have any effect.)


This is a common TCP problem known as the “long fat network” (or “long fat pipe”) issue. If you Google that phrase together with TCP, you’ll find a lot of information on the problem and possible solutions.

This thread has a bunch of calculations and suggestions on tuning the Linux TCP stack for this sort of thing.


The site

http://www.psc.edu/networking/projects/tcptune/

mentions that, since Linux nowadays autotunes TCP settings, messing with the values will likely not improve things.
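
If you want to check whether autotuning is actually on, these are the knobs to look at (just a quick sanity check, assuming a reasonably recent kernel):

    # 1 means receive-buffer autotuning is enabled
    sysctl net.ipv4.tcp_moderate_rcvbuf
    # window scaling must be on, or the window can never exceed 64 KB
    sysctl net.ipv4.tcp_window_scaling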

That being said, 100 ms combined with a large bandwidth (at least 790 Mbps) leads to an enormous bandwidth-delay product (BDP), so maybe the autotuning decides that something is wrong and doesn’t scale the window far enough.
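
As a rough sanity check with the numbers from the question (back-of-the-envelope only):

    # BDP = bandwidth * RTT; at 790 Mbps and 100 ms you need roughly this
    # many bytes in flight:
    echo $(( 790 * 1000 * 1000 / 8 / 10 ))   # 9875000 bytes, ~9.4 MiB
    # the largest advertised window in the trace was 5294720 bytes, which
    # at a 100 ms RTT caps throughput at roughly:
    echo $(( 5294720 * 8 * 10 ))             # 423577600 bits/s, ~420 Mbps
    # and the *average* outstanding window was only 3317125 bytes:
    echo $(( 3317125 * 8 * 10 ))             # 265370000 bits/s, ~265 Mbps

So the window actually in use is nowhere near the ~10 MB the link could carry at that RTT, which fits the picture of the stack (or its autotuning) not scaling the buffers far enough; the retransmissions and the RTT swings up to ~218 ms probably account for part of the remaining gap down to the ~200 Mbps observed.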