How to encode url in haproxy [duplicate]

I have a system that needs to preserve the original referer over multiple redirects. To achieve this I'm trying to write the referer to the URL query string.
I already got that working (somewhat).

http-request set-query ref=%[req.hdr(Referer)]&%[query]

The only problem is that URLs in the querystring have to be encoded. Unfortunately HAProxy only has a url_dec function.

Is there any easy way in which I can encode the URL?


Solution 1:

There doesn't appear to be a built-in for this, but it's easily done with the Lua integration in HAProxy 1.6 and later.

Create a Lua file, let's say /etc/haproxy/lua/url_escape.lua.

I found some examples online of url-escaping ("encoding") in Lua, but none of them I found in a cursory search appeared to be UTF-8 aware. So, I wrote this:

function url_escape(str)
    local escape_next = 0;
    local escaped = str:gsub('.',function(char)
        local ord = char:byte(1);
        if(escape_next > 0) then
            escape_next = escape_next - 1;
        elseif(ord <= 127) then               -- single-byte utf-8
            if(char:match("[0-9a-zA-Z%-%._~]")) then -- only these do not get escaped
                return char;
            elseif char == ' ' then           -- also space, becomes '+'
                return '+';
            end;
        elseif(ord >= 192 and ord < 224) then -- utf-8 2-byte
            escape_next = 1;
        elseif(ord >= 224 and ord < 240) then -- utf-8 3-byte
            escape_next = 2;
        elseif(ord >= 240 and ord < 248) then -- utf-8 4-byte
            escape_next = 3;
        end;
        return string.format('%%%02X',ord);
    end);
    return escaped;
end;

core.register_converters('url_escape',url_escape);

Configure HAProxy to load this, in the global section of /etc/haproxy.cfg:

global
    lua-load /etc/haproxy/lua/url_escape.lua

Now, you have a converter called lua.url_escape, that works like other converters -- it's added with a , to the end of the expression that provides its input.

http-request set-query ref=%[req.hdr(Referer),lua.url_escape]&%[query]

Test:

curl -v http://example.com/my-page.html?lol=cat -H 'Referer: http://example.org/cats/Shrödinger.html'

The request seen by the back-end:

GET /my-page.html?ref=http%3A%2F%2Fexample.org%2Fcats%2FShr%C3%B6dinger.html&lol=cat HTTP/1.1
User-Agent: curl/7.35.0
Host: example.com
Accept: */*
Referer: http://example.org/cats/Shrödinger.html

Shrödinger is correctly escaped here with ö (U+00D6) having two bytes in Shr%C3%B6dinger. 3 and 4 byte characters also appear to be correctly handled. Byte sequences with the high bit on that don't correspond to valid UTF-8 characters will also be escaped.

Note that if you are logging the query string with HAProxy, the modified query is never logged -- the original request is what makes it to the log output.

Solution 2:

UPDATE:

There is a built-in converter now: https://www.haproxy.com/documentation/hapee/2-2r1/onepage/#url_enc

You can use it in an http-request directive like so:

http-request redirect location https://example.com/login?r=%[url,url_enc()] if !{ req.hdr(x-consumer-username) -m found }

In this example, the originally requested URL is URL encoded and added to the querystring at key r.

Original answer:

The HAProxy folks have a blog post section for this exact use-case:

https://www.haproxy.com/blog/5-ways-to-extend-haproxy-with-lua/#converters

Seems like something that should be built-in IMO, however.