Ubuntu HTTP latency falling into strange quantiles
I have an Ubuntu 10.10 server with plenty of RAM, bandwidth, and CPU. I'm seeing a strange, repeatable pattern in the distribution of latencies when serving static files from both Apache and nginx. Because the problem is common to both HTTP servers, I'm wondering if I have misconfigured or poorly tuned Ubuntu's networking or cache parameters.
ab -n 1000 -c 4 http://apache-host/static-file.jpg
:
Percentage of the requests served within a certain time (ms)
50% 5
66% 3007
75% 3009
80% 3011
90% 9021
95% 9032
98% 21068
99% 45105
100% 45105 (longest request)
ab -n 1000 -c 4 http://nginx-host/static-file.jpg
:
Percentage of the requests served within a certain time (ms)
50% 19
66% 19
75% 3011
80% 3017
90% 9021
95% 12026
98% 12028
99% 18063
100% 18063 (longest request)
The results consistently follow this pattern: 50% or more of the requests are served as expected, while the remainder fall into discrete bands, with the slowest a few orders of magnitude slower than the median.
Apache is 2.x with mod_php installed; nginx is 1.0.x with Passenger installed (but neither app server should be in the critical path for a static file). Load average was around 1 when each test was run (the server has 12 physical cores), with 5 GB free RAM and 7 GB cached swap. Tests were run from localhost.
Here are the configuration changes I have made from the Ubuntu Server 10.10 defaults:
/etc/sysctl.conf:
net.core.rmem_default = 65536
net.core.wmem_default = 65536
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.ipv4.tcp_mem = 16777216 16777216 16777216
net.ipv4.tcp_window_scaling = 1
net.ipv4.route.flush = 1
net.ipv4.tcp_no_metrics_save = 1
net.ipv4.tcp_moderate_rcvbuf = 1
net.core.somaxconn = 8192
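(For completeness: after editing /etc/sysctl.conf I reload it without a reboot and spot-check individual keys; these are standard commands, shown here for reference:)
# reload all settings from /etc/sysctl.conf
sudo sysctl -p
# spot-check that a value actually took effect
sysctl net.core.somaxconn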
/etc/security/limits.conf:
* hard nofile 65535
* soft nofile 65535
root hard nofile 65535
root soft nofile 65535
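(These limits only apply to sessions started after the change, so I verify them from a fresh login shell; -n shows the soft limit, -Hn the hard limit:)
# check soft and hard open-file limits in a new login shell
ulimit -n
ulimit -Hn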
other config:
ifconfig eth0 txqueuelen 1000
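(Note: txqueuelen set via ifconfig does not persist across reboots; on this release I keep it in /etc/network/interfaces via a post-up hook, roughly like the stanza below. The dhcp line is illustrative only; my actual addressing differs.)
auto eth0
iface eth0 inet dhcp
    post-up /sbin/ifconfig eth0 txqueuelen 1000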
Please let me know if this kind of problem rings any bells, or if more information about the config would be helpful. Thanks for your time.
Update: Here's what I'm seeing after increasing net.netfilter.nf_conntrack_max as suggested below:
Percentage of the requests served within a certain time (ms)
50% 2
66% 2
75% 2
80% 2
90% 3
95% 3
98% 3
99% 3
100% 5 (longest request)
Solution 1:
Going off your comment that it was the nf_conntrack: table full, dropping packet problem: when the conntrack table fills, new SYNs are silently dropped and the client retries on TCP's exponential retransmission backoff, which is consistent with the discrete multi-second bands you measured. You can either increase the conntrack table size:
sysctl -w net.netfilter.nf_conntrack_max=131072
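To confirm conntrack is really the culprit and to make the new limit survive a reboot, something along these lines should work (the exact wording of the kernel log message varies a little between kernel versions):
# look for "nf_conntrack: table full, dropping packet" in the kernel log
dmesg | grep -i conntrack
# compare current table usage against the limit
sysctl net.netfilter.nf_conntrack_count net.netfilter.nf_conntrack_max
# persist the raised limit across reboots
echo 'net.netfilter.nf_conntrack_max = 131072' >> /etc/sysctl.conf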
Or, if you are already behind a firewall, you can just exempt HTTP traffic from connection tracking entirely:
# iptables -L -t raw
Chain PREROUTING (policy ACCEPT)
target prot opt source destination
NOTRACK tcp -- anywhere anywhere tcp dpt:www
NOTRACK tcp -- anywhere anywhere tcp spt:www
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
NOTRACK tcp -- anywhere anywhere tcp spt:www
NOTRACK tcp -- anywhere anywhere tcp dpt:www
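For reference, rules like the ones listed above can be created roughly as follows (assuming HTTP on port 80; adjust the port to match your setup):
# skip connection tracking for inbound and outbound HTTP
iptables -t raw -A PREROUTING -p tcp --dport 80 -j NOTRACK
iptables -t raw -A PREROUTING -p tcp --sport 80 -j NOTRACK
iptables -t raw -A OUTPUT -p tcp --sport 80 -j NOTRACK
iptables -t raw -A OUTPUT -p tcp --dport 80 -j NOTRACK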