Why is conntrackd not replicating state?

I have a problem with an active/active firewall cluster: the connection tracking state does not seem to be replicated between the firewalls.

It's active/active because I have two routers connected via different ISPs and a network range that is announced through BGP. BGP determines how return traffic is routed back, so the routing is asymmetric. The two firewalls are connected to each other on the inside network, and a virtual IP acts as the default route for the Windows servers.

When both firewalls are running and an inside server opens a connection, the reply comes back via the secondary firewall (the one that has no record of the connection state). The reply is therefore dropped and never reaches the server that initiated the request.

I thought conntrackd would fix this but I can't seem to get it to work. Perhaps I misunderstand how it works. Can I get conntrackd to replicate iptables state at all? Does it actually work in active/active mode? Is state replicated in real time?

Here is what my conntrackd.conf file contains:

Sync {
  Mode ALARM {
    RefreshTime 15
    CacheTimeout 180
  }

  Multicast {
    IPv4_Address 225.0.0.50
    Group 3780
    IPv4_Interface 10.0.0.100
    Interface eth2
    SndSocketBuffer 1249280
    RcvSocketBuffer 1249280
    Checksum on
  }
}

General {
  Nice -20
  HashSize 32768
  HashLimit 131072
  LogFile on
  Syslog on
  LockFile /var/lock/conntrack.lock
  UNIX {
    Path /var/run/conntrackd.ctl
    Backlog 20
  }
  NetlinkBufferSize 2097152
  NetlinkBufferSizeMaxGrowth 8388608
  Filter From Userspace {
    Protocol Accept {
      TCP
    }

    Address Ignore {
      IPv4_address 127.0.0.1 # loopback
      IPv4_address 10.0.0.100 # dedicated link0
      IPv4_address 10.0.0.101 # dedicated link1
      IPv4_address x.x.x.130 # Internal ip
    }
  }
}

The conntrackd.conf on the other router is the same apart from IPv4_Interface in the Multicast section, which is 10.0.0.101, and the internal IP in the Filter section, which ends in 131.

I have set firewall rules to accept input to 225.0.0.50/32 & output to 225.0.0.50/32.
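
A minimal sketch of what such rules could look like. The interface, multicast group and port are taken from the config above; restricting the rules to eth2 and UDP port 3780 is an assumption and is tighter than the rules described. The helper name allow_sync_traffic and the IPTABLES variable are made up for illustration. By default the script only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Dry run by default: IPTABLES="echo iptables" only prints the commands.
# Set IPTABLES=iptables (as root) to actually apply them.
IPTABLES="${IPTABLES:-echo iptables}"

# Hypothetical helper, not part of conntrack-tools.
# $1 = dedicated sync interface, $2 = multicast group, $3 = UDP port
allow_sync_traffic() {
    # Accept sync messages arriving from the peer on the dedicated link.
    $IPTABLES -I INPUT -i "$1" -d "$2" -p udp --dport "$3" -j ACCEPT
    # Allow our own sync messages to leave on the dedicated link.
    $IPTABLES -I OUTPUT -o "$1" -d "$2" -p udp --dport "$3" -j ACCEPT
}

allow_sync_traffic eth2 225.0.0.50 3780
```

Running it as-is prints the two iptables commands; exporting IPTABLES=iptables before running it as root would insert them at the top of the INPUT and OUTPUT chains.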

I've set mode to ALARM here but first tried FTFW. Neither seems to work.

My kernel version is: 3.11.0.

Sorry, cut and paste isn't working from the VirtualBox window. However, when I run sudo conntrackd -i on the first router, the output lists an ESTABLISHED TCP connection, which is the ssh session I opened to it.

On the other router, the same command produces no output, which I take to mean that the state was not transferred to it.

Any ideas?


Update: I ran tcpdump -i eth2 on each machine, and I can see UDP packets arriving from the other router, destined for the multicast address 225.0.0.50, port 3780, with a length of 68 bytes.

If I open an ssh connection I see immediate activity in tcpdump, and disconnecting does the same. Otherwise the same message arrives at regular intervals as a heartbeat. So the routers are clearly sending the packets, but is conntrackd ignoring them? Is there some hidden debug option I can turn on?


Update2: OK, after days of googling and reading source code, I have discovered that conntrackd is replicating the state, but it ends up in an external cache rather than in the kernel's connection tracking table. To commit the cached state to the kernel you need to run conntrackd -c. Clearly, conntrackd is designed by default to be used in active/backup mode.
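
The two caches can be compared directly. As far as I understand the tool, conntrackd -i dumps the internal cache (flows this node tracked itself), conntrackd -e dumps the external cache (flows received from the peer), and conntrack -L lists the kernel table. The sketch below assumes that each dumped flow is one line containing its TCP state; the helper names count_established and show_state_summary are hypothetical.

```shell
#!/bin/sh
# Count ESTABLISHED entries in a conntrackd/conntrack dump read from stdin.
# grep -c exits non-zero when nothing matches, so "|| true" keeps the
# helper usable in scripts running under "set -e".
count_established() {
    grep -c 'ESTABLISHED' || true
}

# Requires root and a running conntrackd; not called automatically.
show_state_summary() {
    echo "internal cache: $(conntrackd -i | count_established)"
    echo "external cache: $(conntrackd -e | count_established)"
    echo "kernel table:   $(conntrack -L 2>/dev/null | count_established)"
}
```

Calling show_state_summary on the secondary router while an ssh session is open through the primary should show the flow in the external cache but not in the kernel table, which would match the behaviour described above.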

It seems an option called CacheWriteThrough was introduced at some point but was later removed. Can conntrackd do active/active or not? I can't find an answer to that.


OK, after days of frustration, little documentation, and even reading the source code, I've figured it out.

Mode FTFW {
     [...]
     DisableExternalCache On
}

Disabling the external cache is what you need for an asymmetric routing scenario. For active/backup, leave DisableExternalCache at its default (Off) and configure the notify_master, notify_backup and notify_fault scripts in keepalived.

The setting CacheWriteThrough was removed and replaced with DisableExternalCache.

The keepalived notify scripts are used to commit the external connection state cache to the kernel on the router that takes over the IP. With DisableExternalCache On they shouldn't be needed, because replicated state is already written straight into the kernel table.
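
For the active/backup case, the notify handling could be sketched roughly as follows. This is a simplified outline from memory, loosely modeled on the primary-backup.sh example that ships with conntrack-tools, and not a replacement for it. The function name on_state_change and the CONNTRACKD variable are made up for illustration. By default it only prints the conntrackd commands.

```shell
#!/bin/sh
# Dry run by default: only prints the conntrackd commands.
# Set CONNTRACKD=conntrackd (as root) to actually run them.
CONNTRACKD="${CONNTRACKD:-echo conntrackd}"

# Hypothetical helper: $1 is the new keepalived state of this node.
on_state_change() {
    case "$1" in
        primary)
            $CONNTRACKD -c   # commit the external cache into the kernel table
            $CONNTRACKD -f   # flush the internal and external caches
            $CONNTRACKD -R   # resync the internal cache with the kernel table
            $CONNTRACKD -B   # send a bulk update to the peer
            ;;
        backup)
            $CONNTRACKD -t   # shorten kernel timers so stale entries expire
            $CONNTRACKD -n   # request a resync from the new primary
            ;;
        fault)
            $CONNTRACKD -t   # shorten kernel timers so stale entries expire
            ;;
        *)
            echo "usage: on_state_change primary|backup|fault" >&2
            return 1
            ;;
    esac
}
```

A wrapper script around this function could then be referenced from notify_master, notify_backup and notify_fault in keepalived, passing primary, backup or fault respectively.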