How can I use a static IP address with an Application Load Balancer in a highly available manner?

I'm overseeing an integration for a customer and their vendor needs a number of IP Addresses to whitelist. The originating server(s) is an Elastic Beanstalk Instance fronted by an Application Load Balancer with all the trimmings via Route53.

That's not going to work, since you can't assign a static IP to Application Load Balancers by definition (and I do need the Layer 7 features).

I can't just proxy the specific requests to the vendor via code and an additional EC2 instance because it's a two-way integration.

I've read through this article, but that frankly seems like a hack and not something I'd do in a production environment.

It certainly seems like I need some combination of an NLB with an ALB, but again, the referenced article introduces a ton of moving parts.

Edit:

  • I am using a VPC
  • The instance itself lives in a private subnet
  • The ALB lives in a public subnet and both have the necessary routing to communicate
  • I'm almost certain I'm talking in circles trying to convince myself Static IP !== Single Point of Failure

There is currently only one way to associate static IP addresses with an Application Load Balancer (ALB) -- AWS Global Accelerator.

Static Anycast IPs – Global Accelerator uses Static IP addresses that serve as a fixed entry point to your applications hosted in any number of AWS Regions. These IP addresses are Anycast from AWS edge locations, meaning that these IP addresses are announced from multiple AWS edge locations, enabling traffic to ingress onto the AWS global network as close to your users as possible. You can associate these addresses to regional AWS resources or endpoints, such as Network Load Balancers, Application Load Balancers, and Elastic IP addresses. You don’t need to make any client-facing changes or update DNS records as you modify or replace endpoints.

https://aws.amazon.com/blogs/aws/new-aws-global-accelerator-for-availability-and-performance/

Global Accelerator allocates two static IPs from two Network Zones¹, and these are unique to your deployment -- not shared. These are advertised out to the Internet via peering connections at multiple locations on the AWS Edge Network (the same network where CloudFront, Route 53, and S3 Transfer Acceleration all operate -- it has more points of presence than just the AWS regions, and AWS-managed fiber connections to the regions). Then you associate the endpoints -- ALB, NLB, EIP, or EC2 Instance (without EIP) -- with the Global Accelerator instance, and traffic from the edge location where the requests arrive is NAT-ed to your balancer.
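For illustration, the association steps above might be scripted roughly like this. It's only a sketch: the function name `create_static_entry_point`, the accelerator name, and the single TCP/443 listener are my own assumptions, and `ga` is expected to be a boto3 Global Accelerator client (or anything exposing the same three methods), so the block itself doesn't import boto3.

```python
def create_static_entry_point(ga, alb_arn, region, name="vendor-ingress"):
    """Put an accelerator in front of an existing ALB and return its static IPs.

    `ga` is assumed to behave like boto3.client("globalaccelerator").
    `name` and the TCP/443 listener are illustrative choices, not requirements.
    """
    # 1. The accelerator itself; the static addresses come back in "IpSets".
    accel = ga.create_accelerator(
        Name=name,
        IpAddressType="IPV4",
        Enabled=True,
    )["Accelerator"]

    # 2. A listener for the port(s) the vendor will connect to.
    listener = ga.create_listener(
        AcceleratorArn=accel["AcceleratorArn"],
        Protocol="TCP",
        PortRanges=[{"FromPort": 443, "ToPort": 443}],
    )["Listener"]

    # 3. An endpoint group in the ALB's region, with the ALB as its endpoint.
    ga.create_endpoint_group(
        ListenerArn=listener["ListenerArn"],
        EndpointGroupRegion=region,
        EndpointConfigurations=[
            {"EndpointId": alb_arn, "ClientIPPreservationEnabled": True}
        ],
    )

    # Flatten the IP sets into the list you hand to the vendor.
    return [
        ip
        for ip_set in accel.get("IpSets", [])
        for ip in ip_set.get("IpAddresses", [])
    ]
```

You'd call it with something like `boto3.client("globalaccelerator", region_name="us-west-2")`; as far as I know the Global Accelerator control-plane API is only served from us-west-2, regardless of where your ALB lives. The ALB ARN and region are whatever your Beanstalk environment already uses.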

When Global Accelerator was initially launched, it relied on Source NAT to tie the global addresses to the VPC devices, so you couldn't use the client source IP or the X-Forwarded-For header from the ALB to determine the client IP address in real-time; however, that has changed -- X-Forwarded-For now correctly identifies the client IP address when an ALB is used with Global Accelerator in most AWS regions.

Client IP address preservation only works when the endpoint is an ALB or an EC2 instance (without EIP). It doesn't work with EIP endpoints or Network Load Balancers; for those cases you can only cross-correlate them later using flow logs, which capture source/destination tuples as well as the intermediate NAT address that your application will see.
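If you do rely on X-Forwarded-For behind the ALB, a minimal sketch of reading it could look like the following. The helper name `client_ip_from_xff` is hypothetical, and it assumes the ALB's default behaviour of appending the connecting peer's address to the end of the header, so only the right-most entry is trusted; anything to the left may have been supplied by the client.

```python
import ipaddress


def client_ip_from_xff(header_value):
    """Return the right-most address in an X-Forwarded-For value, or None.

    Assumes the load balancer appended the peer address as the last entry.
    Entries further left are client-controlled and are deliberately ignored.
    """
    parts = [p.strip() for p in header_value.split(",") if p.strip()]
    if not parts:
        return None
    try:
        # Validate and normalise; rejects garbage and host:port forms.
        return str(ipaddress.ip_address(parts[-1]))
    except ValueError:
        return None
```

With client IP preservation enabled, that last entry should be the real client; without it, you'd see a Global Accelerator NAT address there instead.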


Importantly, ALB is inbound only (connections are only ever established from outside to inside, regardless of the ultimate direction of data transfer), so if your servers are also initiating connections, you need a separate solution for a static source address -- a NAT Gateway.

One NAT Gateway per availability zone, placed on a public subnet, can serve as the default gateway for one or more private subnets within the availability zone, so that all instances on those subnets use the same source IP when contacting the Internet. NAT Gateway is not a black box in a physical place -- it's a feature of the network infrastructure, so it's intrinsically fail-safe and not considered a single point of failure within a single AZ. You can share a single NAT Gateway across availability zones, but then you do have a single point of failure if something catastrophic occurs in that one availability zone (and you'll pay slightly more to transport Internet traffic across AZ boundaries, compared to placing one NAT Gateway in each AZ). The NAT Gateway requires no application changes, because it isn't a proxy -- it's a network address translator that's transparent to the instances that are located on the subnets that are configured to use it. Each NAT Gateway has a static EIP.
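To make the one-NAT-Gateway-per-AZ layout concrete, here is a small sketch that pairs each private subnet with the NAT Gateway in its own availability zone and refuses to plan a cross-AZ route. The function name `plan_nat_routes` and all IDs are made up for illustration; in practice the result would become the 0.0.0.0/0 route in each private subnet's route table.

```python
def plan_nat_routes(private_subnets, nat_gateways):
    """Map each private subnet to the NAT Gateway in the same AZ.

    private_subnets: {subnet_id: availability_zone}
    nat_gateways:    {availability_zone: nat_gateway_id}
    Raises ValueError rather than silently routing across AZ boundaries,
    since a cross-AZ route reintroduces a single point of failure.
    """
    routes = {}
    for subnet_id, az in sorted(private_subnets.items()):
        nat_id = nat_gateways.get(az)
        if nat_id is None:
            raise ValueError(f"no NAT Gateway in {az} for subnet {subnet_id}")
        routes[subnet_id] = nat_id
    return routes


# Hypothetical IDs: two AZs, one NAT Gateway in each.
example_routes = plan_nat_routes(
    {"subnet-private-a": "us-east-1a", "subnet-private-b": "us-east-1b"},
    {"us-east-1a": "nat-aaaa", "us-east-1b": "nat-bbbb"},
)
```

The vendor then whitelists one EIP per NAT Gateway for traffic your servers initiate, in addition to the Global Accelerator addresses for traffic they send you.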


¹ "Network zone" is new AWS terminology, introduced with Global Accelerator. It describes the fact that the two IP addresses are, internally, handled by different infrastructure.


This is frustrating because there is no good way to do this, especially if you are serving a website and want to know who is coming to your site.

Option 1: AWS Global Accelerator - which works but strips the client IP so you have no accounting of where people are coming from.

Option 2: This really obscure method: https://aws.amazon.com/blogs/networking-and-content-delivery/using-static-ip-addresses-for-application-load-balancers/ which, in short, requires putting an NLB in front of an ALB, breaks some functionality in the process, and is a heavy-handed approach.

As of May 2019, these are the main ways to do it.