Why do jumbo frames affect the performance of the server?

Update: Should I enable jumbo frames only on the server and file server, and not on the client?

If so, is there any impact on communication between server and client?


I am running some performance tests for our product.

Currently all of the test machines (servers, file servers, clients, DB) are on a 10G network, connected by a powerful Dell OpenManage switch.

We are using iSCSI for the file server. The server is a cluster that contains several nodes.

The performance test I am running simulates the following scenario:

  1. The client machine creates a large number of threads that send HTTP requests to the server.
  2. Depending on the type of request, the server needs to get some data from the file server, which is shared by all of the server nodes.

The test results are: Without jumbo frames (MTU 1500), server CPU is at 70% and the average response time for an HTTP request is 1 second.

With jumbo frames (MTU 9000), server CPU is at 20% and the average response time for an HTTP request is 5 seconds.

We have configured jumbo frames on all machines, and changed TCP settings.

Any ideas?


Solution 1:

Well:

  • Bigger frames = more data in each packet = your CPU works less to send data (fewer packets per second), but it takes longer to assemble each payload (more latency).
  • Smaller frames = less data in each packet = your CPU works more to send data (more packets per second), but it takes less time to assemble each payload (less latency).
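
To put rough numbers on the trade-off in the two bullets above, here is a minimal sketch that computes the packet rate and per-frame wire time for full-size frames. It assumes a 10 Gb/s link running at line rate and the standard Ethernet per-frame overhead; `frame_stats` is a name made up for this illustration, and the figures are theoretical limits, not measurements from the setup in the question.

```python
# Per-frame overhead on the wire, in addition to the MTU-sized payload:
# 14-byte Ethernet header + 4-byte FCS + 8-byte preamble/SFD + 12-byte inter-frame gap.
ETHERNET_OVERHEAD_BYTES = 14 + 4 + 8 + 12


def frame_stats(mtu_bytes, link_bps=10_000_000_000):
    """Return (frames_per_second, serialization_delay_us) for full-size frames.

    Assumes the link is saturated with frames carrying exactly mtu_bytes of payload.
    """
    wire_bits = (mtu_bytes + ETHERNET_OVERHEAD_BYTES) * 8
    frames_per_second = link_bps / wire_bits
    serialization_delay_us = wire_bits / link_bps * 1e6
    return frames_per_second, serialization_delay_us


if __name__ == "__main__":
    for mtu in (1500, 9000):
        fps, delay_us = frame_stats(mtu)
        print(f"MTU {mtu}: {fps:,.0f} frames/s, {delay_us:.2f} us on the wire per frame")
```

Under these assumptions a saturated 10G link carries roughly 813,000 frames per second at MTU 1500 versus roughly 138,000 at MTU 9000, which fits the lower CPU load seen with jumbo frames. The wire time per frame grows only from about 1.2 to about 7.2 microseconds, so this part of the added latency is small on its own.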

Solution 2:

I have been trying to read up and understand more about the impact of using jumbo frames, and why they still haven't become mainstream after more than a decade. This paper hints at the real-world problems that jumbo frame sizes face, which keep them from being useful beyond the bulk-file-transfer scenario.

http://www.chelsio.com/jumbo_enet_frames.html

Summary of issues contributing to high-latency delays

  1. Delayed pipelining across transmission mediums
  2. Small transmit/receive buffers causing dropped packets = retransmit
  3. Larger packet size = higher chance of collision = retransmit
  4. Lower CRC quality at greater payload lengths = corrupt packets = retransmit
  5. End-to-end path MTU discovery, both ways
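
For the last point in the list above, one common way to check whether a given MTU really works end to end is to send pings with the Don't Fragment flag set and a payload sized to fill the frame exactly, in both directions. The sketch below builds that command. It assumes Linux with the iputils `ping` (the `-M do` option), and the host name `fileserver.example` and the helper names are placeholders, not taken from the setup in the question.

```python
import subprocess

IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo header


def icmp_payload_for_mtu(mtu_bytes):
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu_bytes - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


def build_probe_command(host, mtu_bytes, count=3):
    """Build a ping command that forbids fragmentation (Linux iputils syntax)."""
    payload = icmp_payload_for_mtu(mtu_bytes)
    return ["ping", "-M", "do", "-c", str(count), "-s", str(payload), host]


def path_supports_mtu(host, mtu_bytes):
    """Return True if full-size, unfragmented pings reach the host and come back."""
    result = subprocess.run(
        build_probe_command(host, mtu_bytes), capture_output=True, text=True
    )
    return result.returncode == 0


if __name__ == "__main__":
    # Placeholder host name; only prints the commands, it does not run them.
    for mtu in (1500, 9000):
        print(" ".join(build_probe_command("fileserver.example", mtu)))
```

Running `path_supports_mtu` from the client to the server and from the server to the file server, and again in the reverse directions, would show whether any hop silently drops 9000-byte frames.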