AWS - how to serve faster to overseas users
Our infrastructure runs on AWS in the us-east-1 region (EC2, CloudFront, RDS, ElastiCache).
We are getting more and more users from APAC, and they have started to complain about network speed to our website. (Please note that we already use CloudFront to serve static assets.)
Some options I found after research:
1. Clone the infrastructure to an APAC region (e.g. JP)
   - Cost ($$) is a concern
   - A fact found by a quick test: the latency between us-east-1 <---> ap-northeast-1 is around 160-180ms.
   - Not really feasible in our case. Although we can create a DB read replica in JP, the web servers must still send write operations to the US.
   - ElastiCache does not support cross-region access, i.e. the US ElastiCache is only accessible by US EC2 instances.
2. A VPC in each region, with both VPCs interconnected by an IPsec/VPN tunnel. JP contains only web servers; all other services remain in the US.
   - Still, there is latency between the US and JP
3. Use a WAN optimizer for the VPN tunnel in #2
   - Does anyone have experience with this? I couldn't find much on Google about VPC-to-VPC optimization...
4. Use CloudFlare's Railgun
   - We only need to install the Railgun listener on the US web servers
   - Much, much simpler; we don't even need anything running in JP
My questions:
- What is the best way/industry best practice? Is it to scale out into another region? I know some companies have their infrastructure in a single region only, but how do they ensure speed for overseas users?
- For #2, does a persistent tunnel help?
- For #2/#3, assuming the latency and network speed between regions can be optimized, is it really necessary to have web servers in JP? How about having only proxy servers in JP that proxy requests to US web servers?
Any help will be appreciated, thanks :D
Solution 1:
The world is a fairly large place, and although network bandwidth is steadily increasing, network latency between one part of the world and the opposite end of the globe isn't going away anytime soon.
Optimisation and tuning on multiple levels can improve the user experience, but ultimately you'll reach the point where the only feasible way to improve performance further is to reduce latency by having your data physically closer to your end users.
A good book with many insights and the source of the graph above is High Performance Browser Networking by web performance engineer Ilya Grigorik.
What's most economical/optimal depends on your specific scenario and your code base, and it requires careful testing. There is no magic infrastructure-only solution.
Most applications that need to scale massively go through one or more redesigns to deal with that. Design choices, technology and assumptions that seemed valid for X users will prove to have been wrong for 100 or 1000 times that number.
Interesting lessons learned can be found on the High Scalability blog.
Redesigning your application code so that dynamic content can be cached better is one approach. For example, take a look at the Varnish model, which allows your web application to invalidate cached dynamic content on demand. That works really well when quite a lot of the dynamic content does not actually need to be regenerated completely for each request. It should allow you to make better use of a CDN and means you can stay within a single region.
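To make the invalidate-on-demand idea concrete, here is a minimal in-process sketch in Python. It is not Varnish's API or configuration; the class `PurgeableCache`, its methods and the `render_profile` function are hypothetical names used only for illustration. The point is that a page is rendered once, served from cache for a long TTL, and only regenerated when the application explicitly purges it after a write.

```python
import time


class PurgeableCache:
    """Cache rendered pages with a TTL, plus explicit purge on demand."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get_or_render(self, key, render):
        """Return the cached value, calling render() only on a miss or expiry."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = render()
        self._entries[key] = (now + self._ttl, value)
        return value

    def purge(self, key):
        """Drop one entry; the application calls this right after a write."""
        self._entries.pop(key, None)


# Hypothetical usage: count how often the expensive render actually runs.
render_calls = []


def render_profile():
    render_calls.append(1)
    return "profile v%d" % len(render_calls)


cache = PurgeableCache(ttl_seconds=3600)
first = cache.get_or_render("/profile/42", render_profile)
second = cache.get_or_render("/profile/42", render_profile)  # served from cache
cache.purge("/profile/42")  # e.g. the user just edited their profile
third = cache.get_or_render("/profile/42", render_profile)   # rendered again
```

With a real cache in front of the application the purge would typically be a request sent to the cache tier rather than a method call, but the contract is the same: long TTLs by default, explicit invalidation when the data changes.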
Redesigning your application so that it works across multiple regions will improve disaster recovery as well as performance for international users.
Solution 2:
You have to trade off user experience against dollars. First, I'd want to know what percentage of your users are coming from APAC. If it is less than, say, 10%, your best course of action is probably to wait and see what happens.
You also don't say what type of application you are supporting and how latency sensitive it is. You'd make one decision if it was a real-time video chat application and another if it was an eventually consistent social media app.
All that said, you've found the right set of options.
I like your option 2 the best. I'd put as much of the proxy/web serving as possible as close to the most users as possible. Despite the fact that some traffic will always have to come back to your us-east-1 location, having the first connection terminate in-region will lead to a much better user experience. Think SSL roundtrips.
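A rough back-of-the-envelope model shows why terminating the connection in-region matters. The sketch below is only an estimate under stated assumptions: a new HTTPS connection costs three round trips before the request can be sent (one for the TCP handshake, two for a full TLS handshake), the client-to-JP round trip is 20 ms, the JP-to-US round trip is 170 ms (the middle of the 160-180 ms measured in the question), the proxy keeps a persistent connection open to the US, and the server needs 50 ms to build the response. The function name and all numbers except the 170 ms are illustrative assumptions.

```python
def time_to_first_byte_ms(client_rtt_ms, handshake_round_trips=3,
                          backend_rtt_ms=0, server_time_ms=0):
    """Estimate time to first byte for a brand-new HTTPS connection.

    client_rtt_ms: round-trip time between the user and whatever terminates
        the connection (the US web server, or an in-region proxy).
    handshake_round_trips: round trips spent before the request is sent
        (assumed: 1 for TCP + 2 for a full TLS handshake).
    backend_rtt_ms: one extra round trip from the proxy to the origin over
        an already-open persistent connection; 0 when there is no proxy.
    server_time_ms: time the origin spends generating the response.
    """
    handshake_ms = handshake_round_trips * client_rtt_ms
    request_ms = client_rtt_ms  # request out, first response byte back
    return handshake_ms + request_ms + backend_rtt_ms + server_time_ms


# User in Japan talking directly to us-east-1 (assumed 170 ms RTT).
direct_ms = time_to_first_byte_ms(client_rtt_ms=170, server_time_ms=50)

# Same user talking to a proxy in JP (assumed 20 ms RTT) that forwards the
# request to the US over a persistent tunnel/connection.
proxied_ms = time_to_first_byte_ms(client_rtt_ms=20, backend_rtt_ms=170,
                                   server_time_ms=50)

print(direct_ms, proxied_ms)
```

Under these assumptions the long-haul latency is paid once per request instead of four times per new connection. The benefit shrinks for requests that reuse an existing client connection, so measure with your own traffic before committing.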
I'd also look at SPDY.
I'd also think about moving from us-east-1 to us-west-2 for your US presence.
The VPN tunnels are a good idea and not that hard to set up.
I'd set up redundant tunnels using OpenVPN.