Using Nginx real_ip when you don't know the intermediate proxy IP addresses

Nginx's real_ip module allows you to set the $remote_addr variable based on values sent in particular header fields. It has a special understanding of the X-Forwarded-For header and, with real_ip_recursive on, can use the right-most untrusted value in the header as the connecting IP address.
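
For context, a typical real_ip setup when the proxy addresses are known might look like the sketch below. The 192.0.2.0/24 range is a placeholder from the documentation address space, not a real proxy range.

```nginx
http {
    # Assumption: 192.0.2.0/24 stands in for the known, trusted proxies.
    set_real_ip_from 192.0.2.0/24;

    # Read the replacement address from X-Forwarded-For.
    real_ip_header X-Forwarded-For;

    # Walk the header from the right and skip trusted addresses, so the
    # right-most untrusted entry becomes $remote_addr.
    real_ip_recursive on;
}
```

The replacement only happens when the connection itself comes from an address matched by set_real_ip_from, which is why the proxy addresses have to be known up front.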

I'd like to use the real_ip module to set $remote_addr to the connecting IP address. The problem is that I know how many hops back from the end of the X-Forwarded-For header to look, but not the IP addresses of the proxies in between. As I understand it, that means I can't use set_real_ip_from to mark those proxies as trusted.

What I would like to do is configure nginx to choose the second-to-last address in the list as $remote_addr. It seems that the real_ip module only works if you know the IP addresses of your proxies.
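
To illustrate the limitation, here is a sketch of what happens if every peer is trusted and recursion is left off. It is shown only to make the problem concrete and is not a recommended configuration, since trusting 0.0.0.0/0 lets any client supply the header.

```nginx
http {
    # Illustration only: trust every IPv4 peer (unsafe in practice).
    set_real_ip_from 0.0.0.0/0;
    real_ip_header X-Forwarded-For;

    # With recursion off, $remote_addr is replaced by the LAST address in
    # the header. For "client, proxy" that is the proxy, one hop too far
    # to the right of the address wanted here.
    real_ip_recursive off;
}
```

The module selects addresses by trust, not by position, so there is no directive in this sketch that expresses "take the entry N hops from the end".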

Is there a way to do this with the real_ip module? I've worked up a regex-based solution, but I would prefer to use the real_ip module if possible.

I don't think this is a duplicate of "nginx real_ip_header and X-Forwarded-For seems wrong" or similar questions. To restate the issue:

  • I know how many hops from the end the connecting IP address will be.
  • I don't know the trusted IP addresses for intermediary proxies between the connecting IP and my server, so I can't use set_real_ip_from.

More details on the specifics:

I am running nginx inside Google Cloud, behind the Google Cloud HTTP Load Balancer. Google Cloud uses the X-Forwarded-For header to indicate the entry point to Google Cloud's network. I know that the second-to-last value in the X-Forwarded-For list is the one that I want, but I don't know what the IP address of the last value (the proxy) will be. Even if I could enumerate all of Google Cloud's address space for proxies (it's not specified that GCLB only operates inside GCP's address space), trusting that whole range would also trust any other user who can get a server inside that address space.


Solution 1:

I ended up using a regex-based version. From my nginx config:

http {
    # Regexes are:
    # (?<connecting_ip>\d+\.\d+\.\d+\.\d+), (?<proxy_ip>\d+\.\d+\.\d+\.\d+)$ # IPv4 only
    # (?<connecting_ip_x>[0-9a-f:.]+),\s*(?<proxy_ip>[0-9a-f:.]+)$ # IPv6 and IPv4, and more robust
    #
    # The last IP address is the one from the GCP front end load balancer
    # The second to last IP address in the list is the connecting IP address (i.e. user IP address)
    # We capture both of them. X-Forwarded-For is separated by commas, hopefully whitespace as well
    # but we don't want to trust that too much.
    #
    # Note that ~ at the start of the string in Nginx marks it as a regex. It's not part
    # of the regex.
    #
    # Test cases for regex101
    # 1.1.1.1, 2.2.2.2
    # 1.1.1.1, 2.2.2.2, 3.3.3.3
    # 1.1.1.1
    # ::ffff:130.211.1.102, 2.2.2.2
    # 2001:0db8:85a3:0000:0000:8a2e:0370:7334, 2.2.2.2
    # 2001:41d0:8:e8ad::1, 2600:1901:0:2ad2::
    # 1.1.1.1,2.2.2.2,3.3.3.3
    #
    # It would be better to use the real_ip module, if that is possible
    # https://serverfault.com/q/947835/334330 might get answered for this.


    # Get the IP address of the connecting IP. If we get a direct connection from
    # GCP's health checkers, there won't be an X-Forwarded-For header. We shouldn't
    # be getting any direct connections from other sources without XFF header.
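    map $http_x_forwarded_for $connecting_ip {
        # Capture the proxy IP and connecting_ip_x, then assign the connecting_ip_x.
        # Same pattern as the second regex above, but matched case-insensitively
        # (~* instead of ~) so IPv6 addresses written with upper-case hex digits
        # are captured whole rather than from their last lower-case fragment.
        "~*(?<connecting_ip_x>[0-9a-f:.]+),\s*(?<proxy_ip>[0-9a-f:.]+)$" $connecting_ip_x;
        # No match (e.g. no X-Forwarded-For header): fall back to the peer address.
        default               $remote_addr;
    }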
  # ...
}

and then I use $connecting_ip in my server definition:

server {
  # ...
  location / {
    proxy_set_header X-Real-IP $connecting_ip;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header Host $http_host;
    proxy_set_header X-Forwarded-Host "";
    proxy_redirect off;
    proxy_next_upstream error;
  }
}
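
As a further usage sketch, the same variable can be written to the access log. The format name client_ip and the log path below are hypothetical choices, not part of my original config.

```nginx
http {
    # Hypothetical format name "client_ip". The fields after $connecting_ip
    # are standard nginx variables.
    log_format client_ip '$connecting_ip - $remote_user [$time_local] '
                         '"$request" $status $body_bytes_sent';

    server {
        # Assumed log path; adjust to the local layout.
        access_log /var/log/nginx/access.log client_ip;
    }
}
```

Unlike the real_ip module, this approach leaves $remote_addr itself unchanged, so anything keyed on $remote_addr (such as allow/deny rules) still sees the proxy's address.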