Publicly routable IPv6 Linux container
My goal is to have a routable public IPv6 address for each of my docker containers. I want to be able to connect into and out of my containers using the IPv6 protocol.
I'm using Linode and I've been assigned a public IPv6 pool:
2600:3c01:e000:00e2:: / 64 routed to 2600:3c01::f03c:91ff:feae:d7d7
That "routed to" address was auto-configured by dhcp:
# ip -6 addr show eth0
3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
inet6 2600:3c01::f03c:91ff:feae:d7d7/64 scope global mngtmpaddr dynamic
valid_lft 2591987sec preferred_lft 604787sec
inet6 fe80::f03c:91ff:feae:d7d7/64 scope link
valid_lft forever preferred_lft forever
I set up an AAAA record for ipv6.daaku.org to make it easier to work with:
# nslookup -q=AAAA ipv6.daaku.org
ipv6.daaku.org has AAAA address 2600:3c01:e000:e2::1
To test, I assigned that address manually:
# ip -6 addr add 2600:3c01:e000:00e2::1/64 dev eth0
# ip -6 addr show eth0
3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
inet6 2600:3c01:e000:e2::1/64 scope global
valid_lft forever preferred_lft forever
inet6 2600:3c01::f03c:91ff:feae:d7d7/64 scope global mngtmpaddr dynamic
valid_lft 2591984sec preferred_lft 604784sec
inet6 fe80::f03c:91ff:feae:d7d7/64 scope link
valid_lft forever preferred_lft forever
I can now ping this from my IPv6-enabled home network:
# ping6 -c3 ipv6.daaku.org
PING6(56=40+8+8 bytes) 2601:9:400:12ab:1db7:a353:a7b4:c192 --> 2600:3c01:e000:e2::1
16 bytes from 2600:3c01:e000:e2::1, icmp_seq=0 hlim=54 time=16.855 ms
16 bytes from 2600:3c01:e000:e2::1, icmp_seq=1 hlim=54 time=19.506 ms
16 bytes from 2600:3c01:e000:e2::1, icmp_seq=2 hlim=54 time=17.467 ms
--- ipv6.daaku.org ping6 statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/std-dev = 16.855/17.943/19.506/1.133 ms
I removed the address because I want it in the container only, and went back to the original state:
# ip -6 addr del 2600:3c01:e000:00e2::1/64 dev eth0
# ip -6 addr show eth0
3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
inet6 2600:3c01::f03c:91ff:feae:d7d7/64 scope global mngtmpaddr dynamic
valid_lft 2591987sec preferred_lft 604787sec
inet6 fe80::f03c:91ff:feae:d7d7/64 scope link
valid_lft forever preferred_lft forever
I started a docker container without a network in another terminal:
# docker run -it --rm --net=none debian bash
root@b96ea38f03b3:/#
Stuck its PID in a variable for ease of use:
CONTAINER_PID=$(docker inspect -f '{{.State.Pid}}' b96ea38f03b3)
Set up the netns for that PID:
# mkdir -p /run/netns
# ln -s /proc/$CONTAINER_PID/ns/net /run/netns/$CONTAINER_PID
Created a new device and assigned it the IP:
# ip link add container0 link eth0 type macvlan
# ip link set container0 netns $CONTAINER_PID
# ip netns exec $CONTAINER_PID ip link set dev container0 name eth0
# ip netns exec $CONTAINER_PID ip link set eth0 up
# ip netns exec $CONTAINER_PID ip addr add 2600:3c01:e000:00e2::1/64 dev eth0
Back in the other terminal where I started the container:
# ip -6 addr show eth0
22: eth0@gre0: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500
inet6 2600:3c01::a083:1eff:fea5:5ad2/64 scope global dynamic
valid_lft 2591979sec preferred_lft 604779sec
inet6 2600:3c01:e000:e2::1/64 scope global
valid_lft forever preferred_lft forever
inet6 fe80::a083:1eff:fea5:5ad2/64 scope link
valid_lft forever preferred_lft forever
# ip -6 route
2600:3c01::/64 dev eth0 proto kernel metric 256 expires 2591976sec
2600:3c01:e000:e2::/64 dev eth0 proto kernel metric 256
fe80::/64 dev eth0 proto kernel metric 256
default via fe80::1 dev eth0 proto ra metric 1024 expires 67sec
This doesn't work: I can't connect out from the container (using ping6 ipv6.google.com as my test), nor can I ping the container from my home network across the internet (using ping6 ipv6.daaku.org as my test).
Update: I managed to get outgoing IPv6 working by doing this:
ip -6 addr add 2600:3c01:e000:00e2::1111:1/112 dev docker0 &&
ip6tables -P FORWARD ACCEPT &&
sysctl -w net.ipv6.conf.all.forwarding=1 &&
sysctl -w net.ipv6.conf.all.proxy_ndp=1
CONTAINER_PID=$(docker inspect -f '{{.State.Pid}}' 4fd3b05a04bb)
mkdir -p /run/netns &&
ln -s /proc/$CONTAINER_PID/ns/net /run/netns/$CONTAINER_PID &&
ip netns exec $CONTAINER_PID ip -6 addr add 2600:3c01:e000:00e2::1111:20/112 dev eth0 &&
ip netns exec $CONTAINER_PID ip -6 route add default via 2600:3c01:e000:00e2::1111:1 dev eth0
IPv6 routes on the host:
# ip -6 r
2600:3c01::/64 dev eth0 proto kernel metric 256 expires 2582567sec
2600:3c01:e000:e2::1111:0/112 dev docker0 proto kernel metric 256
2600:3c01:e000:e2::/64 dev eth0 proto kernel metric 256
fe80::/64 dev eth0 proto kernel metric 256
fe80::/64 dev docker0 proto kernel metric 256
fe80::/64 dev veth1775864 proto kernel metric 256
fe80::/64 dev veth102096c proto kernel metric 256
fe80::/64 dev vethdf3a55b proto kernel metric 256
IPv6 routes in the container:
# ip -6 r
2600:3c01:e000:e2::1111:0/112 dev eth0 proto kernel metric 256
fe80::/64 dev eth0 proto kernel metric 256
default via 2600:3c01:e000:e2::1111:1 dev eth0 metric 1024
Still can't ping it from my home machine.
I think your problem is routing-related. The trouble is that you've been assigned a flat /64, but you've decided to sub-subnet off a /112. That's fine for outbound traffic, because your container host knows about all the individual sub-subnets, but when your ISP comes to handle the return packets, they don't know that you've carved off 2600:3c01:e000:e2::1111:0/112 and that it should be routed via 2600:3c01:e000:00e2::1. They just expect the whole of 2600:3c01:e000:00e2::/64 to be sitting there, directly connected and reachable via unicast.
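A quick way to see this in action (just a diagnostic sketch; the container address is the one from your update) is to ping the container's address from outside while watching neighbor discovery on the host's eth0. If the gateway's solicitations for the container address go unanswered, the return path is dying right there:
# ICMPv6 type sits at byte 40 of the IPv6 packet when there are no
# extension headers: 135 = neighbor solicitation, 136 = advertisement.
tcpdump -n -i eth0 'icmp6 and (ip6[40] == 135 or ip6[40] == 136)'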
The problem is that there's no mechanism to tell your ISP that you've decided to start sub-subnetting (actually, that's a lie, there are a number of ways - but they all require the cooperation of your ISP). Your simplest bet is probably to stop routing the traffic to the containers, and start bridging it.
I can't tell you exactly how to do that. I tried, and several people kindly pointed out that I was wrong. Hopefully someone can clarify. But the problem remains that you need to bridge your containers to your next hop, and vice versa, rather than routing them.
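For what it's worth, the general shape of a bridged setup (a sketch only, which I have not tested on Linode; br0 is just a name for illustration, and reconfiguring eth0 remotely can lock you out) would be to enslave eth0 to a bridge and attach the containers to that bridge instead of routing them through docker0:
# Do this from console access, not over SSH: eth0 loses its addresses
# once it becomes a bridge port.
ip link add br0 type bridge
ip link set eth0 master br0
ip link set br0 up
# The host's addresses then move to br0, and each container's veth or
# macvlan interface gets attached to br0 so the container sits directly
# on the provider's /64.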
In Docker 1.0, there are two options for enabling IPv6 connectivity to Docker containers. I had to use the LXC driver rather than libcontainer to get both of these methods to work. You may be able to use radvd; I didn't attempt it.
1) Have the provider route the /64 to your Docker host. This is the easiest option. Enable IPv6 forwarding and assign the /64 to docker0. You don't have to break up this network into smaller ones (e.g., a /112) unless you have multiple Docker bridges or multiple Docker hosts.
This method is discussed in depth in Andreas Neuhaus's blog post "IPv6 in Docker Containers". See http://zargony.com/2013/10/13/ipv6-in-docker-containers.
Note that very few IPv6-enabled IaaS providers will route a /64 to a VM. The second method overcomes this limitation in a semi-kludgy way.
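A rough sketch of what option 1 looks like on the host, reusing the /64 from the question purely for illustration (whether your provider actually routes that /64 to you is the assumption that makes or breaks this):
# Let the host route between eth0 and docker0.
sysctl -w net.ipv6.conf.all.forwarding=1
# Put an address from the routed /64 on the Docker bridge; containers
# then take addresses out of the same /64 and use this as their gateway.
ip -6 addr add 2600:3c01:e000:00e2::1/64 dev docker0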
2) Use a subset of the /64 from the LAN interface on the Docker bridge. This method doesn't require a /64 routed to your Docker host. A smaller network within the /64 (e.g., a /112) on the LAN is assigned to docker0, and the host is configured to proxy NDP between the Docker bridge and your LAN interface (probably eth0).
I wrote a detailed description of this method at http://jeffloughridge.wordpress.com/2014/07/22/ipv6-in-docker-containers-on-digitalocean/.
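A rough sketch of option 2, reusing the addresses from the question's update (the container address 2600:3c01:e000:00e2::1111:20 is just the example value from there):
# Carve a /112 out of the on-link /64 for the Docker bridge.
ip -6 addr add 2600:3c01:e000:00e2::1111:1/112 dev docker0
# Forward between interfaces and let the host answer neighbor
# solicitations on eth0 on behalf of the containers.
sysctl -w net.ipv6.conf.all.forwarding=1
sysctl -w net.ipv6.conf.eth0.proxy_ndp=1
# One proxy entry per container address:
ip -6 neigh add proxy 2600:3c01:e000:00e2::1111:20 dev eth0
With the proxy entry in place, the provider's gateway resolves the container's address to the host's MAC on eth0, and the host forwards the packet on to docker0.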
I haven't used Docker versions newer than 1.0. It is possible that things have changed in newer releases.