Nginx will not start with host not found in upstream
I use nginx to proxy and hold persistent connections to far away servers for me.
I have configured about 15 blocks similar to this example:
upstream rinu-test {
server test.rinu.test:443;
keepalive 20;
}
server {
listen 80;
server_name test.rinu.test;
location / {
proxy_pass https://rinu-test;
proxy_http_version 1.1;
proxy_set_header Connection "";
proxy_set_header Host $http_host;
}
}
The problem is if the hostname can not be resolved in one or more of the upstream
blocks, nginx will not (re)start. I can't use static IPs either, some of these hosts explicitly said not to do that because IPs will change. Every other solution I've seen to this error message says to get rid of upstream
and do everything in the location
block. That it not possible here because keepalive
is only available under upstream
.
I can temporarily afford to lose one server but not all 15.
Edit: Turns out nginx is not suitable for this use case. An alternative backend (upstream) keepalive proxy should be used. A custom Node.js alternative is in my answer. So far I haven't found any other alternatives that actually work.
Earlier versions of nginx (before 1.1.4), which already powered a huge number of the most visited websites worldwide (and some still do even nowdays, if the server headers are to be believed), didn't even support keepalive
on the upstream
side, because there is very little benefit for doing so in the datacentre setting, unless you have a non-trivial latency between your various hosts; see https://serverfault.com/a/883019/110020 for some explanation.
Basically, unless you know you specifically need keepalive between your upstream and front-end, chances are it's only making your architecture less resilient and worse-off.
(Note that your current solution is also wrong because a change in the IP address will likewise go undetected, because you're doing hostname resolution at config reload only; so, even if nginx does start, it'll basically stop working once IP addresses of the upstream servers do change.)
Potential solutions, pick one:
The best solution would seem to just get rid of
upstream
keepalive
as likely unnecessary in a datacentre environment, and use variables withproxy_pass
for up-to-date DNS resolution for each request (nginx is still smart-enough to still do the caching of such resolutions)Another option would be to get a paid version of nginx through a commercial subscription, which has a
resolve
parameter for theserver
directive within theupstream
context.Finally, another thing to try might be to use a
set
variable and/or amap
to specify the servers withinupstream
; this is neither confirmed nor denied to have been implemented; e.g., it may or may not work.
Your scenario is very similar to the one when using aws ELB
as uptreams in where is critical to resolve
the proper IP of the defined domain.
The first thing you need to do and ensure is that the DNS servers you are using can resolve to your domains, then you could create your config like this:
resolver 10.0.0.2 valid=300s;
resolver_timeout 10s;
location /foo {
set $foo_backend_servers foo_backends.example.com;
proxy_pass http://$foo_backend_servers;
}
location /bar {
set $bar_backend_servers bar_backends.example.com;
proxy_pass http://$bar_backend_servers;
}
Notice the resolver 10.0.0.2
it should be IP of the DNS server that works and answer your queries, depending on your setup this could be a local cache service like unbound. and then just use resolve 127.0.0.1
Now, is very important to use a variable to specify the domain name, from the docs:
When you use a variable to specify the domain name in the proxy_pass directive, NGINX re‑resolves the domain name when its TTL expires.
You could check your resolver by using tools like dig
for example:
$ dig +short stackoverflow.com
In case is a must to use keepalive
in the upstreams, and if is not an option to use Nginx +, then you could give a try to openresty balancer, you will need to use/implement lua-resty-dns