Parsing IPv6 extension headers containing unknown extensions
I'm writing a very simple net filter, and I've gotten to the point where I want to parse IPv6 headers to match things like ICMPv6 types, TCP/UDP port numbers, etc.
So I'm reading about the IPv6 packet format in depth, and I had to read it over and over again to make sure I was actually reading it right. It looks to me like you have to start with the 40-byte fixed header and look at its Next Header field. Then you have to look at the next header's Next Header field, and so on, like a linked list, until you reach the end. If there's a payload, it follows the last header.
The problem is that there is no length field either in the fixed header or the extension headers. You have to have a table of extension header types and their sizes so that you can chase this linked list to the end.
This strikes me as a strange, possibly even hare-brained design. What if I encounter an unrecognized extension header type? What do I do? I don't know its length. I guess I have to throw the packet out and block it, since in a net filter allowing the packet through would allow an attacker to evade the net filter by including a bogus header type. But that means that if the protocol is ever extended, every single piece of IPv6 header parsing software ever written must be simultaneously updated if the new extension is to be used.
So how can I parse IPv6 headers if I don't know the extensions they're using? How can I skip a header for an unknown extension, since I don't know its length?
What if I encounter an unrecognized extension header type?
From RFC 2460:
If, as a result of processing a header, a node is required to proceed to the next header but the Next Header value in the current header is unrecognized by the node, it should discard the packet and send an ICMP Parameter Problem message to the source of the packet, with an ICMP Code value of 1 ("unrecognized Next Header type encountered") and the ICMP Pointer field containing the offset of the unrecognized value within the original packet. The same action should be taken if a node encounters a Next Header value of zero in any header other than an IPv6 header.
If you run into something you cannot parse, you have to make your decision or perform your action based on what you've parsed already.
The design is that way because in IPv6, each extension header "wraps" the rest of the packet. If you see the routing header, then some header you've never heard of, then the payload, then you cannot parse the payload. The meaning of the payload depends in principle on the header you don't know how to interpret.
Routers can route such packets, because all they need is the fixed IPv6 header (plus the hop-by-hop options header, if one is present). Deep packet inspection gadgets and suchlike need to know a lot more, but then that's their fate anyway.
Edited to add: This design means that middleboxes can only change what they know. If a middlebox sees a header it doesn't know, then it has only two options: reject the packet or pass it on unchanged. In IPv4 it could also remove the unknown extension and pass on the rest. IMO this property makes the design more rather than less extensible.
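To make "parse what you recognize, stop at the first thing you don't" concrete, here is a minimal sketch of a chain walker. It relies on the fact that the Hop-by-Hop Options, Routing, and Destination Options headers in RFC 2460 each carry a Hdr Ext Len byte (in 8-octet units, not counting the first 8 octets), and that the Fragment header is a fixed 8 bytes. The names `walk_chain` and `fixed_header` and the verdict strings are made up for illustration. As a simplification, AH, ESP, No Next Header, and anything else not listed are treated as unrecognized.

```python
import struct

# Next Header values for the headers this sketch understands.
HOP_BY_HOP, ROUTING, FRAGMENT, DEST_OPTS = 0, 43, 44, 60
TCP, UDP, ICMPV6 = 6, 17, 58
UPPER_LAYER = {TCP, UDP, ICMPV6}
FIXED_LEN = 40  # size of the fixed IPv6 header


def walk_chain(packet):
    """Follow the Next Header chain of a raw IPv6 packet.

    Returns (verdict, next_header, offset, pointer):
      verdict     "upper", "unrecognized", "truncated" or "non_first_fragment"
      next_header the Next Header value at which the walk stopped
      offset      byte offset of the header that value describes
      pointer     offset of the unrecognized Next Header byte (the value
                  RFC 2460 asks for in the ICMP Pointer field), else None
    """
    if len(packet) < FIXED_LEN:
        return ("truncated", None, None, None)
    # Byte 6 of the fixed header is its Next Header field.
    nh, nh_pos, offset = packet[6], 6, FIXED_LEN
    while True:
        if nh in UPPER_LAYER:
            return ("upper", nh, offset, None)
        # A Next Header value of zero is only valid in the fixed header.
        if nh == HOP_BY_HOP and nh_pos != 6:
            return ("unrecognized", nh, offset, nh_pos)
        if nh in (HOP_BY_HOP, ROUTING, DEST_OPTS):
            if offset + 2 > len(packet):
                return ("truncated", nh, offset, None)
            length = (packet[offset + 1] + 1) * 8  # Hdr Ext Len
        elif nh == FRAGMENT:
            length = 8
        else:
            # Unknown type: its length is unknown, so stop here.
            return ("unrecognized", nh, offset, nh_pos)
        if offset + length > len(packet):
            return ("truncated", nh, offset, None)
        if nh == FRAGMENT:
            # The upper 13 bits of bytes 2-3 are the fragment offset.
            frag_off = struct.unpack_from("!H", packet, offset + 2)[0] >> 3
            if frag_off != 0:
                # Upper-layer header is not in this fragment.
                return ("non_first_fragment", nh, offset, None)
        # Each extension header starts with its own Next Header byte.
        nh_pos, nh, offset = offset, packet[offset], offset + length


def fixed_header(next_header, payload_len=0):
    """Build a dummy 40-byte fixed header (zero addresses) for testing."""
    return (bytes([0x60, 0, 0, 0]) + struct.pack("!H", payload_len)
            + bytes([next_header, 64]) + bytes(32))


# Hop-by-hop header (8 bytes) whose Next Header is the unassigned value 200.
pkt = fixed_header(HOP_BY_HOP) + bytes([200, 0, 1, 4, 0, 0, 0, 0]) + b"????"
print(walk_chain(pkt))
```

A filter built on this would match ports or ICMPv6 types only when the verdict is "upper", and would apply its default policy (for a strict filter, drop) to every other verdict, using only the information gathered before the walk stopped.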