Publicly routable IPv6 Linux container
My goal is to have a routable public IPv6 address for each of my docker containers. I want to be able to connect into and out of my containers using the IPv6 protocol.
I'm using Linode and I've been assigned a public IPv6 pool:
2600:3c01:e000:00e2:: / 64 routed to 2600:3c01::f03c:91ff:feae:d7d7
That "routed to" address was auto-configured by dhcp:
# ip -6 addr show eth0
3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
inet6 2600:3c01::f03c:91ff:feae:d7d7/64 scope global mngtmpaddr dynamic
valid_lft 2591987sec preferred_lft 604787sec
inet6 fe80::f03c:91ff:feae:d7d7/64 scope link
valid_lft forever preferred_lft forever
I set up an AAAA record for ipv6.daaku.org to make it easier to work with:
# nslookup -q=AAAA ipv6.daaku.org
ipv6.daaku.org has AAAA address 2600:3c01:e000:e2::1
To test, I assigned that address manually:
# ip -6 addr add 2600:3c01:e000:00e2::1/64 dev eth0
# ip -6 addr show eth0
3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
inet6 2600:3c01:e000:e2::1/64 scope global
valid_lft forever preferred_lft forever
inet6 2600:3c01::f03c:91ff:feae:d7d7/64 scope global mngtmpaddr dynamic
valid_lft 2591984sec preferred_lft 604784sec
inet6 fe80::f03c:91ff:feae:d7d7/64 scope link
valid_lft forever preferred_lft forever
I can now ping this from my IPv6-enabled home network:
# ping6 -c3 ipv6.daaku.org
PING6(56=40+8+8 bytes) 2601:9:400:12ab:1db7:a353:a7b4:c192 --> 2600:3c01:e000:e2::1
16 bytes from 2600:3c01:e000:e2::1, icmp_seq=0 hlim=54 time=16.855 ms
16 bytes from 2600:3c01:e000:e2::1, icmp_seq=1 hlim=54 time=19.506 ms
16 bytes from 2600:3c01:e000:e2::1, icmp_seq=2 hlim=54 time=17.467 ms
--- ipv6.daaku.org ping6 statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/std-dev = 16.855/17.943/19.506/1.133 ms
I removed the address because I want it in the container only, and went back to the original state:
# ip -6 addr del 2600:3c01:e000:00e2::1/64 dev eth0
# ip -6 addr show eth0
3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
inet6 2600:3c01::f03c:91ff:feae:d7d7/64 scope global mngtmpaddr dynamic
valid_lft 2591987sec preferred_lft 604787sec
inet6 fe80::f03c:91ff:feae:d7d7/64 scope link
valid_lft forever preferred_lft forever
I started a docker container without a network in another terminal:
# docker run -it --rm --net=none debian bash
root@b96ea38f03b3:/#
Stuck its PID in a variable for ease of use:
CONTAINER_PID=$(docker inspect -f '{{.State.Pid}}' b96ea38f03b3)
Set up the netns for that PID:
# mkdir -p /run/netns
# ln -s /proc/$CONTAINER_PID/ns/net /run/netns/$CONTAINER_PID
Created a new device and assigned it the IP:
# ip link add container0 link eth0 type macvlan
# ip link set container0 netns $CONTAINER_PID
# ip netns exec $CONTAINER_PID ip link set dev container0 name eth0
# ip netns exec $CONTAINER_PID ip link set eth0 up
# ip netns exec $CONTAINER_PID ip addr add 2600:3c01:e000:00e2::1/64 dev eth0
Back in the other terminal where I started the container:
# ip -6 addr show eth0
22: eth0@gre0: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500
inet6 2600:3c01::a083:1eff:fea5:5ad2/64 scope global dynamic
valid_lft 2591979sec preferred_lft 604779sec
inet6 2600:3c01:e000:e2::1/64 scope global
valid_lft forever preferred_lft forever
inet6 fe80::a083:1eff:fea5:5ad2/64 scope link
valid_lft forever preferred_lft forever
# ip -6 route
2600:3c01::/64 dev eth0 proto kernel metric 256 expires 2591976sec
2600:3c01:e000:e2::/64 dev eth0 proto kernel metric 256
fe80::/64 dev eth0 proto kernel metric 256
default via fe80::1 dev eth0 proto ra metric 1024 expires 67sec
This doesn't work: I can't connect out from the container (using ping6 ipv6.google.com as my test), nor can I ping the container from my home network across the internet (using ping6 ipv6.daaku.org as my test).
Update: I managed to get outgoing IPv6 working by doing this:
ip -6 addr add 2600:3c01:e000:00e2::1111:1/112 dev docker0 &&
ip6tables -P FORWARD ACCEPT &&
sysctl -w net.ipv6.conf.all.forwarding=1 &&
sysctl -w net.ipv6.conf.all.proxy_ndp=1
CONTAINER_PID=$(docker inspect -f '{{.State.Pid}}' 4fd3b05a04bb)
mkdir -p /run/netns &&
ln -s /proc/$CONTAINER_PID/ns/net /run/netns/$CONTAINER_PID &&
ip netns exec $CONTAINER_PID ip -6 addr add 2600:3c01:e000:00e2::1111:20/112 dev eth0 &&
ip netns exec $CONTAINER_PID ip -6 route add default via 2600:3c01:e000:00e2::1111:1 dev eth0
IPv6 routes on the host:
# ip -6 r
2600:3c01::/64 dev eth0 proto kernel metric 256 expires 2582567sec
2600:3c01:e000:e2::1111:0/112 dev docker0 proto kernel metric 256
2600:3c01:e000:e2::/64 dev eth0 proto kernel metric 256
fe80::/64 dev eth0 proto kernel metric 256
fe80::/64 dev docker0 proto kernel metric 256
fe80::/64 dev veth1775864 proto kernel metric 256
fe80::/64 dev veth102096c proto kernel metric 256
fe80::/64 dev vethdf3a55b proto kernel metric 256
IPv6 routes in the container:
# ip -6 r
2600:3c01:e000:e2::1111:0/112 dev eth0 proto kernel metric 256
fe80::/64 dev eth0 proto kernel metric 256
default via 2600:3c01:e000:e2::1111:1 dev eth0 metric 1024
Still can't ping it from my home machine.
I think your problem is routing-related. The trouble is that you've been assigned a flat /64, but you've decided to sub-subnet off a /112. That's fine for outbound traffic, because your container host knows about all the individual sub-subnets, but when your ISP comes to handle the return packets, they don't know that you've carved off 2600:3c01:e000:e2::1111:0/112 and that it should be routed via 2600:3c01:e000:00e2::1. They just expect the whole of 2600:3c01:e000:00e2::/64 to be sitting there, directly connected and reachable via unicast.
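A quick way to see this in action (just a diagnostic sketch; the container address is the one from your update) is to ping the container's address from outside while watching neighbor discovery on the host's eth0. If the gateway's solicitations for the container address go unanswered, the return path is dying right there:
# ICMPv6 type sits at byte 40 of the IPv6 packet when there are no
# extension headers: 135 = neighbor solicitation, 136 = advertisement.
tcpdump -n -i eth0 'icmp6 and (ip6[40] == 135 or ip6[40] == 136)'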
The problem is that there's no mechanism to tell your ISP that you've decided to start sub-subnetting (actually, that's a lie, there are a number of ways - but they all require the cooperation of your ISP). Your simplest bet is probably to stop routing the traffic to the containers, and start bridging it.
I can't tell you exactly how to do that. I tried, and several people kindly pointed out that I was wrong. Hopefully someone can clarify. But the problem remains that you need to bridge your containers to your next hop, and vice versa, rather than routing them.
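For what it's worth, the general shape of a bridged setup (a sketch only, which I have not tested on Linode; br0 is just a name for illustration, and reconfiguring eth0 remotely can lock you out) would be to enslave eth0 to a bridge and attach the containers to that bridge instead of routing them through docker0:
# Do this from console access, not over SSH: eth0 loses its addresses
# once it becomes a bridge port.
ip link add br0 type bridge
ip link set eth0 master br0
ip link set br0 up
# The host's addresses then move to br0, and each container's veth or
# macvlan interface gets attached to br0 so the container sits directly
# on the provider's /64.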
In Docker 1.0, there are two options for enabling IPv6 connectivity to Docker containers. I had to use the LXC driver rather than libcontainer to get both of these methods to work. You may be able to use radvd; I didn't attempt it.
1) Have the provider route the /64 to your Docker host. This is the easiest option. Enable IPv6 forwarding and assign the /64 to docker0. You don't have to break up this network into smaller ones (e.g., a /112) unless you have multiple Docker bridges or multiple Docker hosts.
This method is discussed in depth in Andreas Neuhaus's blog post "IPv6 in Docker Containers". See http://zargony.com/2013/10/13/ipv6-in-docker-containers.
Note that very few IPv6-enabled IaaS providers will route a /64 to a VM. The second method overcomes this limitation in a semi-kludgy way.
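A rough sketch of what option 1 looks like on the host, reusing the /64 from the question purely for illustration (whether your provider actually routes that /64 to you is the assumption that makes or breaks this):
# Let the host route between eth0 and docker0.
sysctl -w net.ipv6.conf.all.forwarding=1
# Put an address from the routed /64 on the Docker bridge; containers
# then take addresses out of the same /64 and use this as their gateway.
ip -6 addr add 2600:3c01:e000:00e2::1/64 dev docker0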
2) Use a subset of the /64 from the LAN interface on the Docker bridge. This method doesn't require a /64 routed to your Docker host. A smaller network within the /64 (e.g., a /112) on the LAN is assigned to docker0, and the host is configured to proxy NDP between the Docker bridge and your LAN interface (probably eth0).
I wrote a detailed description of this method at http://jeffloughridge.wordpress.com/2014/07/22/ipv6-in-docker-containers-on-digitalocean/.
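A rough sketch of option 2, reusing the addresses from the question's update (the container address 2600:3c01:e000:00e2::1111:20 is just the example value from there):
# Carve a /112 out of the on-link /64 for the Docker bridge.
ip -6 addr add 2600:3c01:e000:00e2::1111:1/112 dev docker0
# Forward between interfaces and let the host answer neighbor
# solicitations on eth0 on behalf of the containers.
sysctl -w net.ipv6.conf.all.forwarding=1
sysctl -w net.ipv6.conf.eth0.proxy_ndp=1
# One proxy entry per container address:
ip -6 neigh add proxy 2600:3c01:e000:00e2::1111:20 dev eth0
With the proxy entry in place, the provider's gateway resolves the container's address to the host's MAC on eth0, and the host forwards the packet on to docker0.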
I haven't used Docker versions newer than 1.0. It is possible that things have changed in newer releases.