BPF verifier fails because of invalid access to packet

Solution 1:

I think I have found the core of the issue. The verifier tracks a couple of properties about variables which allow it to determine if the program might access data it shouldn't. One of these properties is umax_value which tracks that the max unsigned int value is, which might be dynamic.

Since packets have finite size the verifier asserts that the umax_value of an offset into the packet may never exceed MAX_PACKET_OFF(65536).

Every time the program loops we add ext_len to data, since ext_len is a __u16 its max uint value is 65536 by default. The program limits this to 30000 with the following statement:

if (ext_len > 30000) {
  goto end;
}

However, the umax_value of data accumulates over each iteration. We can see this in the verifier log:

; if (data_end < (data + sizeof(struct extension))) { 13: (2d) if r5 > r2 goto pc+18 R0_w=pkt(id=22,off=90,r=0,umax_value=66000,var_off=(0x0; 0xffffffff)) R1=inv(id=0) R2=pkt_end(id=0,off=0,imm=0) R3_w=inv(id=0,umin_value=88,umax_value=66088,var_off=(0x0; 0x1ffff),s32_min_value=0,s32_max_value=131071,u32_max_value=131071) R4=inv17179869184 R5_w=pkt(id=22,off=94,r=0,umax_value=66000,var_off=(0x0; 0xffffffff)) R6_w=inv(id=0,umax_value=255,var_off=(0x0; 0xff)) R7_w=inv(id=0) R10=fp0 ; if (ext->type == SERVER_NAME_EXTENSION) { 14: (71) r6 = *(u8 *)(r0 +0)

umax_value is larger then 65536, thus the error.

Now, to fix this we need to change the code so data can't exceed 65536. We do this by specifying a maximum amount of iterations(extensions) and setting the max size of each extension. I modified the program to add these constrains, I chose max 32 extensions and max 2048 bytes per extension which seems sane values (32 * 2048 = 65536), these can be changed.

#include <stddef.h>
#include <linux/bpf.h>
#include "./bpf_endian.h"

#define SEC(NAME) __attribute__((section(NAME), used))

struct server_name
{
    char server_name[256];
};

struct extension
{
    __u16 type;
    __u16 len;
} __attribute__((packed));

struct sni_extension
{
    __u16 list_len;
    __u8 type;
    __u16 len;
} __attribute__((packed));

#define SERVER_NAME_EXTENSION 0

SEC("xdp")
int collect_ips_prog(struct xdp_md *ctx)
{
    void *data_end = (void *)(long)ctx->data_end;
    void *data = (void *)(long)ctx->data;
    void *cursor = (void *)(long)ctx->data;

    if (data_end < (cursor + sizeof(__u16)))
    {
        goto end;
    }

    __s64 extension_method_len = *(__u16 *)cursor;
    if (extension_method_len < 0)
    {
        goto end;
    }

    cursor += sizeof(__u16);

    for (int i = 0; i < 32; i++)
    {
        struct extension *ext;

        if (cursor > extension_method_len + data)
        {
            goto end;
        }

        if (data_end < (cursor + sizeof(*ext)))
        {
            goto end;
        }

        ext = (struct extension *)cursor;

        cursor += sizeof(*ext);

        if (ext->type == SERVER_NAME_EXTENSION)
        {
            struct server_name sn;

            if (data_end < (cursor + sizeof(struct sni_extension)))
            {
                goto end;
            }

            struct sni_extension *sni = (struct sni_extension *)cursor;

            cursor += sizeof(struct sni_extension);

            __u16 server_name_len = sni->len;

            for (int sn_idx = 0; sn_idx < server_name_len; sn_idx++)
            {
                if (data_end < cursor + sn_idx)
                {
                    goto end;
                }

                if (sn.server_name + sizeof(struct server_name) < sn.server_name + sn_idx)
                {
                    goto end;
                }

                sn.server_name[sn_idx] = ((char *)cursor)[sn_idx];
            }

            sn.server_name[server_name_len] = 0;
            goto end;
        }

        if (ext->len > 2048)
        {
            goto end;
        }

        if (data_end < cursor + ext->len)
        {
            goto end;
        }

        cursor += ext->len;
    }

end:
    return XDP_PASS;
}

The limitation here is obvious, even if we have 31 extensions of just a few bytes, the 32'nd can never be larger than 2048 bytes. There might be a way to track the sum of all extensions so far and check that this sum never exceeds 65536, allowing us to get rid of these "worst-case-senario" constants and check the actual umax_value, but I will leave that as a research topic for someone else.