Windows 2008 Server on mixed ethernet speeds — download from share slow, but fast plain TCP
Solution 1:
The problem was caused by:
- packet buffers that were too small in an inexpensive gigabit switch;
- an inadequate congestion avoidance algorithm used in Windows Server 2008 File Services;
- flow control being disabled in the network adapter (it was disabled by default).
Because flow control was disabled, Windows sent packets up to the window size in one batch over the 1Gbps connection. Since the 100Mbps client receives packets much more slowly, almost all of the data up to the window size had to be buffered by the switch. This cheap switch has very small buffers (buffer sizes aren't even stated in the specifications, but they must be less than 64kB per port, as even disabling window scaling did not help), so it had to drop the excess packets. Each packet loss caused a delay of about 0.25s, visible on the graph. But the congestion avoidance algorithm used in File Services, or the lack of one, did not reduce the TCP window size, so the next batch of packets wasn't any smaller: it congested the connection again and again, causing congestion collapse.
Standard TCP connections (not File Services) presumably use a different congestion control algorithm and do not get congested repeatedly. I suppose this special treatment of File Services by the Windows TCP stack helps in benchmarks against, for example, Samba.
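As a rough illustration of the buffering argument above, here is a minimal sketch of how much data a switch has to queue when a burst arrives at 1Gbps and drains at 100Mbps. The function names and the 32 KiB buffer size are assumptions made for the example (the real buffer size of the switch is unknown), and the model ignores framing overhead and ACK timing.

```python
def peak_switch_backlog(burst_bytes, in_bps, out_bps):
    """Peak number of bytes queued in the switch for a single burst.

    The burst arrives at in_bps while the egress port drains at out_bps,
    so by the time the last byte arrives only the fraction out_bps / in_bps
    of the burst has already left the switch.
    """
    if out_bps >= in_bps:
        return 0.0
    return burst_bytes * (1 - out_bps / in_bps)


def dropped_bytes(burst_bytes, in_bps, out_bps, buffer_bytes):
    """Bytes that do not fit in the port buffer and have to be dropped."""
    backlog = peak_switch_backlog(burst_bytes, in_bps, out_bps)
    return max(0.0, backlog - buffer_bytes)


if __name__ == "__main__":
    window = 64 * 1024           # unscaled TCP window, sent in one batch
    gigabit = 1_000_000_000      # server-side link speed in bits/s
    fast_ethernet = 100_000_000  # client-side link speed in bits/s
    assumed_buffer = 32 * 1024   # hypothetical per-port buffer, not from any spec

    print("peak backlog:", peak_switch_backlog(window, gigabit, fast_ethernet))
    print("dropped:", dropped_bytes(window, gigabit, fast_ethernet, assumed_buffer))
```

With a 64 KiB window about 90% of the burst, roughly 59 kB, has to sit in the switch at once, which is consistent with a small-buffer switch dropping packets even when window scaling is disabled.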
So the solutions are:
- Enable flow control in the network adapter properties. It isn't an ideal solution, as any File Services transfer to a 100Mbps client will also slow down concurrent transfers to 1Gbps clients to less than 100Mbps.
- Or connect the 100Mbps clients to an enterprise-class switch with much bigger buffers. This is the solution I used. I have a 10-year-old "3Com SuperStack 3 3300 SM" switch with one 1000Base-SX fiber optic gigabit Ethernet MT-RJ port. I bought a Cisco 1000BASE-SX mini-GBIC module (MGBSX1) with an LC port for my Linksys gigabit switch and an LC/MT-RJ multi-mode fiber patch cord (about $150 for both) and connected all 100Mbps clients to the 3Com switch. I've also enabled flow control, but it should not cause slowdowns when no 100Mbps client is connected.
Thanks to SpacemanSpiff, whose comments helped to resolve this.
Solution 2:
Does the Windows server have SMB signing enabled? SMB signing adds overhead that slows transfers, and it is enabled by default on domain controllers.
Solution 3:
Might it be the 100Mbps card/switch? You mention that the same client works properly when it is on 1Gbps.
Solution 4:
Feels like a lower-level network issue. My guesses:
- Duplex mismatch. It would certainly bring performance down quite a bit. On the Linux side, use the ethtool command to verify that you're negotiating at 100 Mbps/Full Duplex. If your card negotiates at 100/Half and the switch thinks the connection is 100/Full, there will be all kinds of problems. You might want to experiment with forcing 100/Full instead of auto-negotiating the speed (remember that you have to force 100/Full on both the switch and the system).
- It could also be a buffer issue, either on the client's network card or on the switch. I've seen network card drivers not allocate enough buffer space and cause speed issues. I imagine the same type of problem could happen on the switch. It is far harder to diagnose, other than by swapping equipment.
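To make the duplex check above easy to repeat, here is a small sketch that pulls the Speed, Duplex and Auto-negotiation lines out of `ethtool <interface>` output. The function names and the interface name `eth0` are assumptions for the example; running it needs ethtool installed, and possibly root.

```python
import subprocess


def parse_link_settings(ethtool_output):
    """Extract Speed, Duplex and Auto-negotiation from `ethtool <iface>` text."""
    wanted = ("Speed", "Duplex", "Auto-negotiation")
    settings = {}
    for line in ethtool_output.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and key in wanted:
            settings[key] = value.strip()
    return settings


def is_100_full(settings):
    """True when the link reports 100Mb/s at full duplex."""
    return settings.get("Speed") == "100Mb/s" and settings.get("Duplex") == "Full"


if __name__ == "__main__":
    iface = "eth0"  # assumed interface name, adjust for your system
    result = subprocess.run(
        ["ethtool", iface], capture_output=True, text=True, check=True
    )
    link = parse_link_settings(result.stdout)
    print(link)
    if not is_100_full(link):
        print("Link is not 100/Full; compare with the switch port settings.")
```

This only shows what the client side negotiated. A mismatch still has to be confirmed against the switch port configuration.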