HAProxy encode url for querystring
I have a system that needs to preserve the original referer over multiple redirects. To achieve this I'm trying to write the referer to the URL query string.
I already got that working (somewhat).
http-request set-query ref=%[req.hdr(Referer)]&%[query]
The only problem is that URLs in the querystring have to be encoded. Unfortunately HAProxy only has a url_dec function.
Is there any easy way in which I can encode the URL?
There doesn't appear to be a built-in for this, but it's easily done with the Lua integration in HAProxy 1.6 and later.
Create a Lua file, let's say /etc/haproxy/lua/url_escape.lua
.
I found some examples online of url-escaping ("encoding") in Lua, but none of them I found in a cursory search appeared to be UTF-8 aware. So, I wrote this:
function url_escape(str)
local escape_next = 0;
local escaped = str:gsub('.',function(char)
local ord = char:byte(1);
if(escape_next > 0) then
escape_next = escape_next - 1;
elseif(ord <= 127) then -- single-byte utf-8
if(char:match("[0-9a-zA-Z%-%._~]")) then -- only these do not get escaped
return char;
elseif char == ' ' then -- also space, becomes '+'
return '+';
end;
elseif(ord >= 192 and ord < 224) then -- utf-8 2-byte
escape_next = 1;
elseif(ord >= 224 and ord < 240) then -- utf-8 3-byte
escape_next = 2;
elseif(ord >= 240 and ord < 248) then -- utf-8 4-byte
escape_next = 3;
end;
return string.format('%%%02X',ord);
end);
return escaped;
end;
core.register_converters('url_escape',url_escape);
Configure HAProxy to load this, in the global
section of /etc/haproxy.cfg
:
global
lua-load /etc/haproxy/lua/url_escape.lua
Now, you have a converter called lua.url_escape
, that works like other converters -- it's added with a ,
to the end of the expression that provides its input.
http-request set-query ref=%[req.hdr(Referer),lua.url_escape]&%[query]
Test:
curl -v http://example.com/my-page.html?lol=cat -H 'Referer: http://example.org/cats/Shrödinger.html'
The request seen by the back-end:
GET /my-page.html?ref=http%3A%2F%2Fexample.org%2Fcats%2FShr%C3%B6dinger.html&lol=cat HTTP/1.1
User-Agent: curl/7.35.0
Host: example.com
Accept: */*
Referer: http://example.org/cats/Shrödinger.html
Shrödinger
is correctly escaped here with ö (U+00D6) having two bytes in Shr%C3%B6dinger
. 3 and 4 byte characters also appear to be correctly handled. Byte sequences with the high bit on that don't correspond to valid UTF-8 characters will also be escaped.
Note that if you are logging the query string with HAProxy, the modified query is never logged -- the original request is what makes it to the log output.
UPDATE:
There is a built-in converter now: https://www.haproxy.com/documentation/hapee/2-2r1/onepage/#url_enc
You can use it in an http-request
directive like so:
http-request redirect location https://example.com/login?r=%[url,url_enc()] if !{ req.hdr(x-consumer-username) -m found }
In this example, the originally requested URL is URL encoded and added to the querystring at key r
.
Original answer:
The HAProxy folks have a blog post section for this exact use-case:
https://www.haproxy.com/blog/5-ways-to-extend-haproxy-with-lua/#converters
Seems like something that should be built-in IMO, however.