Forwarding HTTP Request with Direct Server Return

I have servers spread across several data centers, each storing different files. I want users to be able to access the files on all servers through a single domain and have the individual servers return the files directly to the users.

The following shows a simple example: 1) The user's browser requests http://www.example.com/files/file1.zip 2) Request goes to server A, based on the DNS A record for example.com. 3) Server A analyzes the request and works out that /files/file1.zip is stored on server B. 4) Server A forwards the request to server B. 5) Server B returns file1.zip directly to the user without going through server A.

Note: steps 4 and 5 must be transparent to the user and cannot involve sending a redirect to the user as that would violate the requirement of a single domain.

From my research, what I want to achieve is called "Direct Server Return" and it is a common setup for load balancing. It is also sometimes called a half reverse proxy.

For step 4, it sounds like I need to do MAC Address Translation and then pass the request back onto the network and for servers outside the network of server A tunneling will be required.

For step 5, I simply need to configure server B, as per the real servers in a load balancing setup. Namely, server B should have server A's IP address on the loopback interface and it should not answer any ARP requests for that IP address.

My problem is how to actually achieve step 4?

I have found plenty of hardware and software that can do this for simple load balancing at layer 4, but these solutions fall short and cannot handle the kind of custom routing I require. It seems like I will need to roll my own solution.

Ideally, I would like to do the routing / forwarding at the web server level, i.e. in PHP or C# / ASP.net. However, I am open to doing it at a lower level such as Apache or IIS, or at an even lower level, i.e. a custom proxy service in front of everything.

Thanks.


Daniel - you could probably get this done with LVS and a clustered filesystem relatively easily. Your LVS load balancers would receive the HTTP requests and they would forward the requests to a web node that can respond to the request. The response would leave directly from the web node and the load balancer wouldn't be involved at all in the reply.

I put together a guide for this (on Fedora) that works well on virtualized Xen instances.