Need to DUPLICATE HTTP requests to two servers
Solution 1:
What we eventually chose was Gor (now GoReplay): https://github.com/buger/goreplay
This solution installs a listener on the original host that records every incoming HTTP request, without modifying it or blocking the production server from handling it.
It then pushes these requests to a Gor replay server, which can split or amplify the load based on the incoming requests: you can send a percentage of the requests to a dev server, replay a multiple of the requests to create simulated (but derived from real traffic) load on your staging environment, or both.
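To make the split/multiply idea concrete, here is a minimal Python sketch. It is not how GoReplay is implemented, and `plan_replay`, `dev_fraction` and `staging_multiplier` are hypothetical names invented for illustration; it only models the decision "send a fraction to dev, repeat everything N times for staging".

```python
import random


def plan_replay(requests, dev_fraction, staging_multiplier, rng=None):
    """Decide where each captured request would be replayed.

    Hypothetical helper, not part of GoReplay: it only models the
    "percentage to dev, multiple to staging" idea described above.
    """
    if rng is None:
        rng = random.Random()
    to_dev, to_staging = [], []
    for req in requests:
        # Forward roughly dev_fraction of the captured requests to dev.
        if rng.random() < dev_fraction:
            to_dev.append(req)
        # Repeat every captured request staging_multiplier times for staging.
        to_staging.extend([req] * staging_multiplier)
    return to_dev, to_staging


# Made-up captured traffic, just to exercise the function.
captured = [f"GET /item/{i}" for i in range(1000)]
dev, staging = plan_replay(captured, dev_fraction=0.1, staging_multiplier=3,
                           rng=random.Random(42))
print(len(dev), len(staging))
```

The production server never appears in this sketch because it keeps handling the original requests untouched; only copies are routed.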
Sadly, this works at the server level, so to capture all the traffic you have to install the listener on each production server. You don't have to cover every server, though, and it provides a great solution for the problem outlined in my question.
Solution 2:
Although it is not what you are asking for, I will suggest another approach to testing the new server.
If you put a load balancer in front of both servers and play with the load balancing algorithms, you can test the new server and gradually replace the old one at the same time. You can send 99% of the requests to the old server, and the remaining 1% will go to the new one, where you can closely check whether the service is working as expected.
If everything works fine you can increase the load gradually; 90%-10%, 80%-20%, and so on.
Hint: Check haproxy and the weight and static-rr options.
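
As a rough illustration of what a 99/1 weighted split looks like, here is a small Python sketch of weighted round-robin. It is not HAProxy's implementation; it uses the "smooth" weighted round-robin technique, and the server names `old` and `new` and the function `weighted_round_robin` are assumptions made up for the example.

```python
def weighted_round_robin(weights, n):
    """Return n server names, picked in proportion to their weights.

    Smooth weighted round-robin: on every pick each server's score grows
    by its weight, the highest score wins, and the winner's score is
    reduced by the sum of all weights.
    """
    current = {name: 0 for name in weights}
    total = sum(weights.values())
    picks = []
    for _ in range(n):
        for name, weight in weights.items():
            current[name] += weight
        chosen = max(current, key=current.get)
        current[chosen] -= total
        picks.append(chosen)
    return picks


# 99% of requests to the old server, 1% to the new one.
picks = weighted_round_robin({"old": 99, "new": 1}, 100)
print(picks.count("old"), picks.count("new"))  # prints: 99 1
```

Moving to 90%-10% or 80%-20% is then just a matter of changing the two weights.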