How does anycast work with TCP?

TCP, being stateful, should require subsequent packets to reach the same server. (Stateless) HTTP runs on top of TCP, and CDNs can use anycast.

So how does TCP work with anycast? What if the SYN and the ACK go to different servers? I think I've heard Google has some solution to this, but I'm not sure.

Please answer for both IPv4 and IPv6, if there's any difference.


This is one of those challenges that can be approached in many different ways. The simplest approach is to ignore it and hope for the best. As long as routing doesn't change mid-connection, it will be fine. But when routing does change, it will break all the connections affected by the change. The other answers already go into more depth on this approach.

Another approach is to track where connections are routed. If a packet gets routed to the wrong POP, the CDN can tunnel the packet to the right POP for further processing. This does introduce additional overhead: the client will experience increased latency once it happens, and the increased latency will persist for the lifetime of the connection. But it is likely better for the user experience than a broken connection.
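A minimal sketch of what that packet-path decision might look like, assuming a hypothetical per-POP flow table, a shared map of which POP owns each flow (the tracking discussed below), and invented peer addresses with a simple UDP encapsulation - real CDNs do this in purpose-built data planes (tunnel encapsulation in the kernel or in hardware), not in application code:

```python
import socket

# Hypothetical per-POP state (all names and addresses are invented):
# flows this POP owns, a shared map of flow -> owning POP, and the
# unicast tunnel addresses of the other POPs.
local_flows = set()                 # {(client_ip, client_port, server_ip, server_port)}
flow_owner = {}                     # populated from a shared tracking table
pop_unicast = {"fra1": "192.0.2.20", "ams1": "192.0.2.10"}
TUNNEL_PORT = 4789

def deliver_locally(packet):
    print("handing %d bytes to the local TCP stack" % len(packet))

def handle_packet(packet, flow):
    if flow in local_flows:
        deliver_locally(packet)     # normal, fast path
    elif flow in flow_owner:
        # Mid-connection packet that routing now sends to the wrong POP:
        # encapsulate it and forward it to the POP holding the TCP state,
        # at the cost of one extra hop of latency.
        dst = pop_unicast[flow_owner[flow]]
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
            s.sendto(packet, (dst, TUNNEL_PORT))
    else:
        local_flows.add(flow)       # brand-new flow: accept it here
        deliver_locally(packet)
```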

In terms of bandwidth consumption, the overhead is not very significant, because it affects only packets in one direction, and that tends to be the direction with the lower bandwidth usage.

The tracking could be done at the connection level, or by recording which POP should be serving each individual client IP address. The most obvious data structure for tracking the connections would be a distributed hash table.
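To make the difference between the two granularities concrete, here is a hedged sketch of what the lookup key might be in each case; the hash-based node selection is just a stand-in for real DHT routing, and the POP names and addresses are invented:

```python
import hashlib

# Hypothetical POPs participating in the shared tracking table.
POPS = ["ams1", "fra1", "iad1", "sin1"]

def table_node(key: str) -> str:
    """Pick which POP stores the entry for this key."""
    digest = hashlib.sha256(key.encode()).digest()
    return POPS[int.from_bytes(digest[:4], "big") % len(POPS)]

# Connection-level tracking: one entry per TCP 4-tuple.
print(table_node("198.51.100.7:51512->203.0.113.1:443"))

# Client-level tracking: one entry per client IP, recording its preferred POP.
print(table_node("198.51.100.7"))
```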

If the client supports MPTCP, another solution becomes possible. As soon as the connection has been established, the server opens another subflow using a unicast IP address. If such a subflow is successfully established, the connection can survive a change in routing of the anycast address by simply using the unicast address for the remaining lifetime of the connection.

In principle all of the above approaches would be the same for IPv4 and IPv6. But in practice some solutions may not work as well on IPv4 due to the shortage of IP addresses.

For example, the MPTCP approach requires each server to have a public IP address in order to work well. A large load-balancing setup might have too many servers to assign a public IP address to each. Additionally, the new subflow cannot be initiated by the server if the client is behind a NAT, which is often the case with IPv4. That means the server would instead have to send the unicast IP address as an option over the initial subflow and let the client initiate the extra subflow.
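On Linux this maps fairly directly onto the kernel's MPTCP support: the application merely opens an MPTCP socket, and the kernel path manager (configured out of band) advertises the server's unicast address so that the client can open the extra subflow, which also side-steps the NAT problem because the client initiates it. A minimal sketch, assuming Linux 5.6+ with MPTCP enabled; the addresses and port are placeholders:

```python
import socket

# IPPROTO_MPTCP is protocol 262 on Linux; the socket module constant only
# exists in newer Python versions, so fall back to the raw number.
IPPROTO_MPTCP = getattr(socket, "IPPROTO_MPTCP", 262)

# Assumes sysctl net.mptcp.enabled=1.  Advertising the server's unicast
# address (the ADD_ADDR that lets the client open the extra subflow) is
# done by the kernel path manager, configured out of band, e.g.:
#   ip mptcp endpoint add 2001:db8::1 signal
srv = socket.socket(socket.AF_INET6, socket.SOCK_STREAM, IPPROTO_MPTCP)
srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
srv.bind(("::", 8443))      # in production: the anycast address, port 443
srv.listen()
conn, peer = srv.accept()   # if the client also used MPTCP, extra subflows
                            # to the unicast address are handled transparently
```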

I don't know which of the above approaches have been used by CDNs.


Anycast is best described as a "one-to-nearest" routing scheme. It typically works by having BGP (Border Gateway Protocol) announce the same destination IPs from multiple locations, so that packets get routed to whichever announcing location is nearest.
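As a toy illustration of the "one-to-nearest" idea (not a real BGP implementation), two clients that see the same prefix announced over different AS-path lengths end up talking to different POPs; POP names and AS numbers are made up:

```python
# Each client's network prefers the announcement with the shortest AS path,
# so the "same" IP address leads different clients to different machines.
def nearest_pop(announcements_seen):
    """announcements_seen: {pop_name: AS path seen by this client}"""
    return min(announcements_seen, key=lambda pop: len(announcements_seen[pop]))

client_a_view = {"pop-ams": [64500], "pop-iad": [64500, 64496, 64497]}
client_b_view = {"pop-ams": [64500, 64496, 64497], "pop-iad": [64500]}
print(nearest_pop(client_a_view))   # -> pop-ams
print(nearest_pop(client_b_view))   # -> pop-iad
```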

So in the broad sense, anycast is just used to figure out which server to connect to, and there's nothing about it that makes it unsuited to TCP, or stateful networking.

The primary use case for anycast is in CDNs (Content Delivery Networks), which generally have short-lived and/or stateless connections - as you'd expect when delivering lots of small, static webpage content. In this use case, anycast's assumption that the network topology will remain the same for at least the length of the session is fairly safe, given the short length of the typical session and the minimal consequences if that assumption turns out to be false - worst case, the session fails in the middle and the user reloads the webpage.

The drawback of using anycast for longer sessions, or for uses which are intolerant of disruptions, is that the network topology is more likely to change over a longer timeframe, and the connection will silently break if that happens ("POP switching"). As you allude to in your question, Google (and others) are working on methods of solving this problem, but for now it's all proprietary and secret.

So the answer to your question of how anycast works with TCP is really that it works just fine, unless the network topology changes... if the topology changes, it [potentially] breaks.

There's an interesting presentation here (warning, pdf) with real-world data about the use of anycast, including some long-lived sessions, and it would seem that in the real world, "POP switching" (where the network topology changes in the middle of a session and breaks a connection) is a very uncommon experience - in one dataset with 683,204 sessions, 23,795 of them longer than 10 minutes, only 4 sessions got POP-switched.


It works better than you would expect, especially for TCP sessions that are usually pretty short-lived, such as those generated by HTTP clients.

Anycast assumes that the network topology isn't going to change for the duration of the session, and if it does change it isn't likely that another endpoint will suddenly be nearer than the one that negotiated the session. The application protocol should handle this sort of disconnect/reconnect activity.

CDNs work very well on anycast, since their whole business model is short-lived TCP sessions with significantly unidirectional network transfer out of their network. If the ACK stream ends up going somewhere other than the endpoint that originally negotiated the connection, the connection will hang for that one asset.
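Since the application protocol is expected to handle that sort of disconnect, here is a hedged sketch of the usual client-side mitigation: bound each request with a timeout and retry on a fresh connection, so a mid-transfer POP switch costs one retried asset instead of an indefinite hang (standard library only; the URL is a placeholder):

```python
import socket
import urllib.error
import urllib.request

def fetch_with_retry(url, attempts=3, timeout=5):
    """Fetch one asset; a hung connection (e.g. after a POP switch) times
    out and is retried on a brand-new TCP connection."""
    for attempt in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                return resp.read()
        except (socket.timeout, urllib.error.URLError):
            if attempt == attempts - 1:
                raise   # give up after the last attempt

# usage (placeholder URL):
# body = fetch_with_retry("https://cdn.example.com/asset.js")
```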