Load balancing network traffic using iptables

I am trying to load balance traffic from an internal LAN on a Linux router that has two gateways. Initially I went for the iproute implementation, which didn't balance the load as expected because routes are cached.

Now I am using iptables to mark every new connection with CONNMARK, and then adding rules to route the marked connections over different gateways.

eth0 - LAN, eth1 - ISP1, eth2 - ISP2

This is the script I am using:


echo 1 >| /proc/sys/net/ipv4/ip_forward
echo 0 >| /proc/sys/net/ipv4/conf/all/rp_filter

#   flush all iptables entries
iptables -t filter -F
iptables -t filter -X
iptables -t nat -F
iptables -t nat -X
iptables -t mangle -F
iptables -t mangle -X
iptables -t filter -P INPUT ACCEPT
iptables -t filter -P OUTPUT ACCEPT
iptables -t filter -P FORWARD ACCEPT

# initialise chains that will do the work and log the packets
iptables -t mangle -N CONNMARK1
iptables -t mangle -A CONNMARK1 -j MARK --set-mark 1
iptables -t mangle -A CONNMARK1 -j CONNMARK --save-mark
iptables -t mangle -A CONNMARK1 -j LOG --log-prefix 'iptables-mark1: ' --log-level info

iptables -t mangle -N CONNMARK2
iptables -t mangle -A CONNMARK2 -j MARK --set-mark 2
iptables -t mangle -A CONNMARK2 -j CONNMARK --save-mark
iptables -t mangle -A CONNMARK2 -j LOG --log-prefix 'iptables-mark2: ' --log-level info

iptables -t mangle -N RESTOREMARK
iptables -t mangle -A RESTOREMARK -j CONNMARK --restore-mark
iptables -t mangle -A RESTOREMARK -j LOG --log-prefix 'restore-mark: ' --log-level info

iptables -t nat -N SNAT1
iptables -t nat -A SNAT1 -j LOG --log-prefix 'snat-to- ' --log-level info
iptables -t nat -A SNAT1 -j SNAT --to-source

iptables -t nat -N SNAT2
iptables -t nat -A SNAT2 -j LOG --log-prefix 'snat-to- ' --log-level info
iptables -t nat -A SNAT2 -j SNAT --to-source

# restore the fwmark on packets that belong to an existing connection
iptables -t mangle -A PREROUTING -i eth0 \

# if the mark is zero it means the packet does not belong to an existing connection
iptables -t mangle -A PREROUTING -m state --state NEW \
     -m statistic --mode nth --every 2 --packet 0 -j CONNMARK1
iptables -t mangle -A PREROUTING -m state --state NEW \
     -m statistic --mode nth --every 2 --packet 1 -j CONNMARK2

iptables -t nat -A POSTROUTING -o eth1 -j SNAT1
iptables -t nat -A POSTROUTING -o eth2 -j SNAT2

# register the routing table names once (the original ifs lacked then/fi)
if ! grep -q '^51' /etc/iproute2/rt_tables; then
    echo '51     rt_link1' >> /etc/iproute2/rt_tables
fi

if ! grep -q '^52' /etc/iproute2/rt_tables; then
    echo '52     rt_link2' >> /etc/iproute2/rt_tables
fi

ip route flush table rt_link1 2>/dev/null
ip route add dev eth1 src table rt_link1
ip route add default via table rt_link1
ip route flush table rt_link2 2>/dev/null
ip route add dev eth2 src table rt_link2
ip route add default via table rt_link2

ip rule del from all fwmark 0x1 lookup rt_link1 2>/dev/null
ip rule del from all fwmark 0x2 lookup rt_link2 2>/dev/null
ip rule del from all fwmark 0x2 2>/dev/null
ip rule del from all fwmark 0x1 2>/dev/null

ip rule add fwmark 1 table rt_link1
ip rule add fwmark 2 table rt_link2

ip route flush cache

With this, connections do get routed over both links. However, some of them are dropped, i.e. the connections do not get through. In some cases an established connection is disrupted midway.
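One way to see how new connections are actually being split is to tally the LOG prefixes the script emits. The following is only a minimal sketch: the function name count_marks is hypothetical, and the log file location is an assumption that depends on how syslog is configured on the router.

```shell
#!/bin/sh
# Hypothetical helper: count how many logged packets carry each
# mark prefix used by the CONNMARK1 and CONNMARK2 chains.
# $1 is the path to a kernel log file (the location is an assumption).
count_marks() {
    logfile=$1
    # grep -c prints 0 but exits non-zero when nothing matches,
    # so "|| true" keeps the function usable under "set -e".
    n1=$(grep -c 'iptables-mark1: ' "$logfile" || true)
    n2=$(grep -c 'iptables-mark2: ' "$logfile" || true)
    echo "mark1=$n1 mark2=$n2"
}

# Example call (path is an assumption):
#   count_marks /var/log/kern.log
```

If the two counts are far apart, or connections that drop show up with no matching restore-mark entries, that narrows down where to look.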

Am I missing something?

Solution 1:

Here's another approach. Instead of marking connections based upon packet count and hoping they don't get reinitialized, duplicated, or otherwise altered, just divide the packets up by source or destination IP. For any sufficiently large set of connections, you should have about a 50-50 spread.

I'm posting the following as a drop-in replacement, but you can probably do away with the CONNMARK logic altogether with a bit more tinkering.

iptables -t mangle -A PREROUTING -m state --state NEW \
    -d -j CONNMARK1
iptables -t mangle -A PREROUTING -m state --state NEW \
    -d -j CONNMARK2
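
The address arguments to -d did not survive in the rules above, so they are left as posted. The odd/even wording in this answer suggests a split on address parity. Assuming that reading, this small sketch shows which chain a given IPv4 address would land in. The name mark_for_ip is hypothetical, and sending even addresses to mark 1 is an arbitrary choice.

```shell
#!/bin/sh
# Hypothetical helper: print the mark (1 or 2) an IPv4 address would
# receive under an odd/even split on its last octet.
# Assumption: even last octet -> mark 1, odd last octet -> mark 2.
mark_for_ip() {
    ip_last_octet=${1##*.}          # strip everything up to the last dot
    if [ $((ip_last_octet % 2)) -eq 0 ]; then
        echo 1
    else
        echo 2
    fi
}

# Example calls:
#   mark_for_ip 192.0.2.10
#   mark_for_ip 192.0.2.11
```

Because the decision depends only on the address, every packet to the same host takes the same link regardless of packet order. iptables accepts a dotted netmask after the slash, so a mask that keeps only the lowest bit is one way such a rule might be written, but treat that as an assumption rather than the original rule.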

You could also change the destination to the source if there's more variance in source IPs, or even combine them into a bracket (odd/odd or even/even go to CONNMARK1; odd/even or even/odd go to CONNMARK2).
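
The bracket idea can be sketched the same way: two addresses with the same parity go to one chain, mixed parity to the other. This is an illustration only, and the name mark_for_pair is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the mark for a source/destination pair.
# Same parity (odd/odd or even/even) -> mark 1 (CONNMARK1)
# Mixed parity (odd/even or even/odd) -> mark 2 (CONNMARK2)
mark_for_pair() {
    src_last_octet=${1##*.}
    dst_last_octet=${2##*.}
    # The sum of two numbers is even exactly when their parities match.
    if [ $(( (src_last_octet + dst_last_octet) % 2 )) -eq 0 ]; then
        echo 1
    else
        echo 2
    fi
}

# Example call:
#   mark_for_pair 10.0.0.1 192.0.2.4
```

Combining both addresses helps when either side alone is skewed, for example when most LAN clients talk to a handful of destinations.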