Why couldn't Twitter scale by adding servers the way sites like Facebook have?
I have been looking for an explanation of why Twitter had to migrate part of its middleware from Rails to Scala. What prevented them from scaling the way Facebook has, by adding servers as the user base expanded? More specifically, what about the Ruby/Rails technology prevented the Twitter team from taking this approach?
Solution 1:
It's not that Rails doesn't scale; rather, requests for "live" data in Ruby (or any interpreted language) don't scale well, because they are far more expensive in both CPU and memory than their compiled-language counterparts.
Now, were Twitter a different type of service, one with the same enormous user base but data that changed less frequently, Rails could be a viable option via caching, i.e. avoiding live requests to the Rails stack entirely by offloading to a front-end server and/or an in-memory cache. An excellent article on this topic (a minimal sketch of the idea follows the link):
How Basecamp Next got to be so damn fast
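To make the caching idea concrete, here is a minimal sketch of the cache-aside pattern, written in Scala for consistency with the rest of this answer. The key format, the 60-second TTL, and the `renderProfileFromDb` helper are all hypothetical; a real deployment would use memcached or Redis rather than an in-process map.

```scala
import scala.collection.concurrent.TrieMap

// Cache-aside sketch: serve repeat reads from memory and only pay the
// expensive "live" rendering cost on a miss. TrieMap stands in for
// memcached/Redis here; the 60-second TTL is a hypothetical choice.
object PageCache {
  private final case class Entry(value: String, expiresAt: Long)
  private val cache = TrieMap.empty[String, Entry]
  private val ttlMillis = 60 * 1000L

  def getOrRender(key: String)(render: => String): String = {
    val now = System.currentTimeMillis()
    cache.get(key) match {
      case Some(e) if e.expiresAt > now =>
        e.value // cache hit: no live request to the app stack
      case _ =>
        val rendered = render // cache miss: do the expensive work once
        cache.put(key, Entry(rendered, now + ttlMillis))
        rendered
    }
  }
}

// Usage: only the first request per TTL window pays the rendering cost.
// PageCache.getOrRender("user:42:profile") { renderProfileFromDb(42) }
```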
However, Twitter did not ditch Rails for scaling issues alone; they made the switch because Scala, as a language, provides certain compile-time guarantees about the state of your application that dynamically typed languages cannot: if it compiles, time-wasting bugs such as fat-fingered typos, incorrect method calls, and incorrect type declarations simply cannot exist.
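For instance, here is a contrived Scala sketch of the class of bug the compiler rules out; the Tweet model is hypothetical, purely for illustration:

```scala
// A hypothetical model, purely to illustrate the compile-time checks above.
final case class Tweet(id: Long, text: String)

object CompileTimeChecks {
  def charCount(tweet: Tweet): Int = tweet.text.length

  // Each line below is rejected at compile time in Scala, whereas a
  // dynamic language would only fail when the code path actually runs:
  //
  //   val t = Tweet(1L, "hello")
  //   charCount(t.txt)           // fat-fingered typo: "value txt is not a member of Tweet"
  //   charCount("just a string") // incorrect method call: expected Tweet, found String
  //   val n: Int = t.text        // incorrect type declaration: String is not Int
}
```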
For Twitter, TDD was not enough. A quote from Dijkstra cited in Programming in Scala illustrates the point: "testing can only prove the presence of errors, never their absence". As their application grew, they ran into more and more hard-to-track-down bugs. The magical mystery tour was becoming a hindrance beyond performance, so they made the switch. By all accounts it was an overwhelming success: Twitter is to Scala what Facebook is to PHP (although Facebook cheats a bit with HipHop, their own PHP-to-C++ compiler ;-)).
To sum up, Twitter made the switch for both performance and reliability. Of course, Rails tends to be at the forefront of innovation, so the 99% of applications that never see Twitter-level traffic can get by just fine with an interpreted language (although I'm now solidly on the compiled-language side of the fence; Scala is just too good!).
Solution 2:
No platform can scale out infinitely while still dealing with complex sets of data that change moment to moment. Language and infrastructure matter, but how you build your site and your data access patterns matter more.
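As a concrete example of access patterns dominating: Twitter's public engineering talks describe precomputing home timelines at write time ("fan-out on write") rather than querying every followee at read time. Below is an illustrative Scala sketch of that trade-off; all names and structures are hypothetical, not Twitter's actual code.

```scala
import scala.collection.mutable

// Illustrative sketch: the same "show my timeline" feature can cost
// O(followees) on every read, or O(1) reads with the work shifted to
// write time. All names here are hypothetical.
object TimelineSketch {
  type UserId  = Long
  type TweetId = Long

  private val followers = mutable.Map.empty[UserId, List[UserId]].withDefaultValue(Nil)
  private val inbox     = mutable.Map.empty[UserId, List[TweetId]].withDefaultValue(Nil)

  // Fan-out on write: push the tweet into every follower's precomputed
  // timeline, so each read becomes a single cheap lookup. The alternative
  // (fan-out on read) would scan every followee's tweets per request.
  def postTweet(author: UserId, tweet: TweetId): Unit =
    followers(author).foreach { f => inbox(f) = tweet :: inbox(f) }

  // Reading is now trivial; no per-request joins across followees.
  def homeTimeline(user: UserId, limit: Int = 20): List[TweetId] =
    inbox(user).take(limit)
}
```

Neither pattern is free: fan-out on write makes a post by a user with millions of followers expensive, which is exactly the kind of "upgrading one bit stresses another bit" trade-off described below.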
If you've ever played games like Transport Tycoon or Settlers, where you have to transport resources around, you'll know that you need to stay on top of upgrading infrastructure as usage increases.
Scaling platforms like Facebook and Twitter is a never-ending task. You have an ever-increasing number of users, and you're being pushed to add more features and functionality. It's a continual process of upgrading one bit, which then puts more stress on another bit.
Throwing servers at the problem isn't always the answer, and can sometimes cause more problems than it solves.
Solution 3:
http://highscalability.com/scaling-twitter-making-twitter-10000-percent-faster links to a set of posts about the changes, including a decent history of the steps taken over time.
The short version is that Ruby and Rails didn't deliver the performance and reliability the service required. Given the scale, this isn't surprising; most commercial off-the-shelf (COTS) solutions are unsatisfactory at the very large end of the scale.
High Scalability covers a lot of questions about architecture at that top end for other sites as well, so it helps answer broader questions in this area too.