What are the differences between channel bonding modes in Linux?
Under Linux you can combine multiple network interfaces into a "bonded" network interface to provide failover.
But there are several modes, some of which do not require switch support. My switch isn't a constraint: I can use any of the modes.
However, in reading about the different modes it's not immediately clear what the pros and cons of each one are.
- Do some modes provide a faster failover?
- What about CPU load impact for each mode?
- Which modes can combine the bandwidth rather than just provide redundancy?
- Are there limitations to that?
- Does balance-rr require switch support?
- Reliability? What are your experiences running long term?
Solution 1:
The biggest factor in fail-over is the speed with which a link failure is detected. Unplug the cable from the host and they'll all work pretty well. Leave a live link on an otherwise dead switch and most of the modes (except for those that support beacons/keepalives) are going to send part of your traffic nowhere.
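The Linux bonding driver's rough equivalent of a beacon is ARP monitoring: instead of trusting the local carrier state, it probes one or more configured targets, so it can notice a live link into a dead switch. A minimal sketch using module options; the mode, interval and target addresses here are only illustrative, and ARP monitoring can't be combined with miimon or with every mode, so check bonding.txt for your kernel:

    # Example only: probe two gateway addresses every 500 ms instead of trusting carrier state
    options bonding mode=active-backup arp_interval=500 arp_ip_target=192.0.2.1,192.0.2.254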
Generally speaking, network traffic handling is interrupt driven, so the choice of mode has little impact on CPU load; the various hashing algorithms aren't going to make a meaningful difference.
Any mode that isn't active/standby or broadcast-all will share traffic to varying degrees. Some modes can balance on a per-packet basis, others work per flow. The former spreads load more evenly, while the latter is far more useful (read: functional/stable) in actual networks.
Yes - there are limitations to each mode, but we need to know a lot more about your application to speak to them.
Only LACP/802.3ad (mode 4) explicitly requires support on the switch. That said, just because you send to the switch with a particular pattern doesn't mean the switch will send -back- to you in the same manner.
The only mode I tend to trust in production is 802.3ad which, with an appropriately configured switch, will assure that only the correct links will end up in the channel as well as providing some measure of symmetry in traffic sharing and a predictable response when a link is down. This mode also avoids some common-but-nasty problems (i.e. unicast flooding). Active/standby is also quite common. The other modes may be required for certain circumstances but, IMO, tend to be more painful.
Other flow/MAC/IP based balancing modes or active/standby can be fine, too, and may be required when dealing with unmanaged switches.
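Whatever mode you choose, the bonding driver's status file is the quickest way to verify it is doing what you expect: it shows per-slave link state, the currently active slave and, in 802.3ad mode, whether each link actually joined the aggregator. A minimal check, assuming the bond is named bond0:

    # Per-slave MII status, active slave, and (for 802.3ad) aggregator membership
    cat /proc/net/bonding/bond0

    # Watch link state while pulling cables to see how quickly failover is detected
    watch -n1 'grep -E "Slave Interface|MII Status" /proc/net/bonding/bond0'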
Solution 2:
Most of these points are quite thoroughly described in the /usr/src/linux/Documentation/networking/bonding.txt
documentation file from the Linux source package of your favorite distro. For most modes, failover speed is controlled by the "miimon" parameter (the link-check interval, in milliseconds); it shouldn't be set too low, and normal values are well under one second anyway.
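For illustration, this is roughly what the module options look like on distros that read /etc/modprobe.d/; the mode and the timer values are examples only, not recommendations:

    # /etc/modprobe.d/bonding.conf -- miimon is the link-check interval in milliseconds;
    # downdelay/updelay add hysteresis before a slave is disabled or re-enabled
    options bonding mode=active-backup miimon=100 downdelay=200 updelay=200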
Here are the best parts, completed by me:
balance-rr or 0
Round-robin policy: Transmit packets in sequential
order from the first available slave through the
last. This mode provides load balancing and fault
tolerance.
active-backup or 1
Active-backup policy: Only one slave in the bond is
active. A different slave becomes active if, and only
if, the active slave fails. The bond's MAC address is
externally visible on only one port (network adapter)
to avoid confusing the switch.
This mode provides fault tolerance. The "primary"
option affects the behavior of this mode.
balance-xor or 2
XOR policy: Transmit based on the selected transmit
hash policy. The default policy is a simple [(source
MAC address XOR'd with destination MAC address) modulo
slave count]. Alternate transmit policies may be
selected via the xmit_hash_policy option.
This mode provides load balancing and fault tolerance.
broadcast or 3
Broadcast policy: transmits everything on all slave
interfaces. This mode provides fault tolerance.
802.3ad or 4
IEEE 802.3ad Dynamic link aggregation. Creates
aggregation groups that share the same speed and
duplex settings. Utilizes all slaves in the active
aggregator according to the 802.3ad specification.
Slave selection for outgoing traffic is done according
to the transmit hash policy, which may be changed from
the default simple XOR policy via the xmit_hash_policy
option. Note that not all transmit policies may be 802.3ad
compliant, particularly in regards to the packet mis-ordering
requirements of section 43.2.4 of the 802.3ad standard.
Differing peer implementations will have varying tolerances for
noncompliance.
Note: Most switches will require some type of configuration
to enable 802.3ad mode.
balance-tlb or 5
Adaptive transmit load balancing: channel bonding that
does not require any special switch support. The
outgoing traffic is distributed according to the
current load (computed relative to the speed) on each
slave. Incoming traffic is received by the current
slave. If the receiving slave fails, another slave
takes over the MAC address of the failed receiving
slave.
balance-alb or 6
Adaptive load balancing: includes balance-tlb plus
receive load balancing (rlb) for IPv4 traffic, and
does not require any special switch support.
When a link is reconnected or a new slave joins the
bond the receive traffic is redistributed among all
active slaves in the bond by initiating ARP Replies
with the selected MAC address to each of the
clients. The updelay parameter must
be set to a value equal or greater than the switch's
forwarding delay so that the ARP Replies sent to the
peers will not be blocked by the switch.
balance-rr, active-backup, balance-tlb and balance-alb don't need switch support.
balance-rr augments performance, but at the price of out-of-order packet delivery; it performs poorly with some protocols (CIFS) and with more than 2 interfaces.
balance-alb and balance-tlb may not work properly with all switches; there are often ARP problems (some machines may fail to connect to each other, for instance). You may need to tweak various settings (miimon, updelay) to get stable networking.
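If you need to experiment with those settings on a running bond, the driver also exposes them through sysfs. A sketch, assuming the bond is named bond0 (the driver rounds updelay down to a multiple of miimon):

    # Example runtime tweaks; values are illustrative
    echo 100 > /sys/class/net/bond0/bonding/miimon
    echo 2000 > /sys/class/net/bond0/bonding/updelay
    cat /sys/class/net/bond0/bonding/mode    # confirm the active mode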
balance-xor may or may not require switch configuration. You need to set up an interface group (not LACP) on HP and Cisco switches, but apparently it's not necessary on D-Link, Netgear and Fujitsu switches.
802.3ad absolutely requires an LACP group on the switch side. It's the best supported option overall for augmenting performance.
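As a sketch of what that looks like with iproute2 (the interface names and the address are assumptions, and the corresponding switch ports must be configured as an LACP group):

    # Create an 802.3ad bond hashing on layer 3+4 so different flows can use different links
    ip link add bond0 type bond mode 802.3ad miimon 100 lacp_rate fast xmit_hash_policy layer3+4
    ip link set eth0 down; ip link set eth0 master bond0
    ip link set eth1 down; ip link set eth1 master bond0
    ip link set bond0 up
    ip addr add 192.0.2.10/24 dev bond0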
Note: except for balance-rr's per-packet striping, one network connection always goes through one and only one physical link. So when aggregating GigE interfaces, a single file transfer from machine A to machine B generally can't top 1 gigabit/s, even if each machine has 4 aggregated GigE interfaces.
Solution 3:
The kernel docs answer some of those questions:
Ethernet bonding (Documentation/networking/bonding.txt in the kernel source)