Why does iperf still report 1 Gbps performance when using bonding over two 1 Gbps adapters?

Solution 1:

Bonded interfaces do not grant additional bandwidth to individual network flows. So if you are only running one copy of iperf, you will only be able to use one network interface at a time. If you have two NICs in a lagg, you will need at least two completely independent copies of iperf running on the computer to see any simultaneous utilization. This applies to real workloads as well: e.g. a single Samba client will still only see 1 Gbps of throughput, but two clients could each see 1 Gbps if your lagg has two NICs. All of this assumes the lagg is configured to use both NICs (the 802.3ad option will do this).
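
On Linux, the equivalent of a lagg is a bonding interface; a minimal sketch of creating an 802.3ad (LACP) bond with iproute2 looks like this (the interface names eno1/eno2 and the address are placeholders, and the switch ports must be configured as an LACP LAG as well):

# create an 802.3ad (LACP) bond and enslave both NICs
ip link add bond0 type bond mode 802.3ad
ip link set eno1 down
ip link set eno2 down
ip link set eno1 master bond0
ip link set eno2 master bond0
ip link set bond0 up
ip addr add 192.0.2.10/24 dev bond0

# verify that both slaves joined the same aggregator
cat /proc/net/bonding/bond0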

Solution 2:

After contacting Netgear support, it appears that:

If you use 2 stations (1 client / 1 server), it will actually only use one link (hence the 1 Gbps / 940 Mbps); the link used is decided by the LACP hashing algorithm.

To go above the 1 Gbps limit, you will need to test with more than 1 client.

Source: Netgear support ticket response

The same ticket response links to a post on Netgear's public forum, where we can read that:

You can only get 2Gbps aggregate when the LACP hashing algorithm puts multiple traffic streams down different paths and it doesn't always. With a small number of clients (2 in your case), odds are good that they both might get hashed to the same link.

For those who don't want to read the entire forum discussion, here are the key points:

  • There should be at least two clients connecting to the server to benefit from LACP. A single client will use one link only, which will limit its speed to 1 Gbps.

  • Two clients should be using different links to benefit from LACP.

  • With only two network adapters on the server, there is a 50% chance that both clients get hashed to the same link, which caps the total speed at 1 Gbps. Three network adapters reduce that chance to 33%, and four to 25%.

To conclude, there is no way with the Netgear GS728TS to obtain 1.4 to 1.8 Gbps between two machines.
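
Note that hashing happens independently in each direction: the switch picks the egress link for traffic towards the server, and the Linux bonding driver picks it for traffic towards the clients according to its xmit_hash_policy. As a rough sketch of how to inspect this on the server (bond0 is a placeholder name; whether the policy can be changed on a live bond depends on the kernel/driver version):

# show the transmit hash policy currently used by the bond (default is layer2, MAC-based)
grep "Transmit Hash Policy" /proc/net/bonding/bond0

# layer3+4 hashes on IP addresses and TCP/UDP ports, which usually spreads
# several parallel streams better than the MAC-based layer2 policy
echo layer3+4 > /sys/class/net/bond0/bonding/xmit_hash_policy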

Solution 3:

This Q&A was very helpful for me to understand bonding with LACP, but there is no concrete example of how to verify a throughput of about 1.8 Gb/s. For me it was important to verify this, so I will share how I tested it.

As @ChrisS noted in his answer, it is important to have completely independent copies of iperf running. To achieve this, I connect to the lacp-server from two clients. On the lacp-server I use screen to run independent instances of iperf in two screen windows/sessions. I also ensure independent data streams by using a different port for each connection. The switch with the LACP bond to the server is a TP-Link T1600G-52TS. All devices run Debian 10 (Buster). The two test clients are each connected to a port of the switch. First I started iperf in server mode twice on the lacp-server within screen, and then ran the following on the clients at the same time (using ssh):

iperf --time 30 --port 5001 --client lacp-server   # first test client
iperf --time 30 --port 5002 --client lacp-server   # second test client
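
One way to start both transfers at almost the same moment is to background the two ssh calls from a third machine; client1 and client2 are placeholder hostnames:

# run both clients in parallel and wait for them to finish
ssh client1 "iperf --time 30 --port 5001 --client lacp-server" &
ssh client2 "iperf --time 30 --port 5002 --client lacp-server" &
wait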

Here are the results on the lacp-server for the first connection:

lacp-server ~$ iperf -s -p 5001
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 192.168.10.11 port 5001 connected with 192.168.10.69 port 44120
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-30.0 sec  2.99 GBytes   855 Mbits/sec

and for the second connection:

lacp-server ~$ iperf -s -p 5002
------------------------------------------------------------
Server listening on TCP port 5002
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 192.168.10.11 port 5002 connected with 192.168.10.80 port 48930
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-30.0 sec  3.17 GBytes   906 Mbits/sec

Together this is a bandwidth of 855 Mb/s + 906 Mb/s = 1761 Mb/s, i.e. about 1.76 Gb/s.

@ArseniMourzenko noted in his answer:

With only two network adapters on the server, there is a 50% chance that both clients get hashed to the same link, which caps the total speed at 1 Gbps. Three network adapters reduce that chance to 33%, and four to 25%.

I have repeated the test more than 10 times to verify this, but I always get a bandwidth of about 1.8 Gb/s, so I cannot confirm it.

The interface statistics show that the load is balanced:

lacp-server ~$ ip -statistics link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    RX: bytes  packets  errors  dropped overrun mcast
    3088       30       0       0       0       0
    TX: bytes  packets  errors  dropped carrier collsns
    3088       30       0       0       0       0
2: eno1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond1 state UP mode DEFAULT group default qlen 1000
    link/ether 5e:fb:29:44:e9:cd brd ff:ff:ff:ff:ff:ff
    RX: bytes  packets  errors  dropped overrun mcast
    39231276928 25845127 0       0       0       916
    TX: bytes  packets  errors  dropped carrier collsns
    235146272  3359187  0       0       0       0
3: eno2: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond1 state UP mode DEFAULT group default qlen 1000
    link/ether 5e:fb:29:44:e9:cd brd ff:ff:ff:ff:ff:ff
    RX: bytes  packets  errors  dropped overrun mcast
    36959564721 24351697 0       0       0       60
    TX: bytes  packets  errors  dropped carrier collsns
    267208437  3816988  0       0       0       0
4: bond1: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 5e:fb:29:44:e9:cd brd ff:ff:ff:ff:ff:ff
    RX: bytes  packets  errors  dropped overrun mcast
    69334437898 50196824 0       4253    0       976
    TX: bytes  packets  errors  dropped carrier collsns
    502354709  7176175  0       0       0       0
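
To see how a single run is distributed over the slaves, the sysfs byte counters can also be sampled before and after a test; a rough bash sketch using the interface names from above:

# snapshot the RX byte counters per slave, run the test, print the deltas
declare -A rx_before
for dev in eno1 eno2; do
    rx_before[$dev]=$(cat /sys/class/net/$dev/statistics/rx_bytes)
done
sleep 30    # run the iperf clients during this window
for dev in eno1 eno2; do
    rx_now=$(cat /sys/class/net/$dev/statistics/rx_bytes)
    echo "$dev: $(( rx_now - ${rx_before[$dev]} )) bytes received"
done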

With three test clients I get these results:

  • 522 Mb/s + 867 Mb/s + 486 Mb/s = 1.875 Gb/s
  • 541 Mb/s + 863 Mb/s + 571 Mb/s = 1.975 Gb/s
  • 534 Mb/s + 858 Mb/s + 447 Mb/s = 1.839 Gb/s
  • 443 Mb/s + 807 Mb/s + 606 Mb/s = 1.856 Gb/s
  • 483 Mb/s + 805 Mb/s + 512 Mb/s = 1.800 Gb/s


References:
Link Aggregation and LACP basics
LACP bonding and Linux configuration
Linux Ethernet Bonding Driver HOWTO
RedHat - Using Channel Bonding