What types of systems have to “scale up” rather than “scale out”?

I primarily work with an application that has zero horizontal scaling potential. Even though it runs on Linux, the application, data structures and I/O requirements force me to "scale up" onto progressively larger systems in order to accommodate increased user workloads.

Many legacy line-of-business and transactional applications have these types of constraints. It's one reason I stress that the industry focus on cloud solutions and DevOps-driven web-scale architectures ignores a good percentage of the computing world.

Unfortunately, the scale-up systems I describe are really unsexy, so the industry tends to ignore their value or deemphasize the skills needed to address large, critical systems (they're the "pets" in the pets-versus-cattle metaphor).


From a developer perspective, I can say that nearly every traditional mainstream database engine out there can only scale up; scaling out is very much an afterthought.

In recent years, with the need for greater scalability and highly available systems, there have been efforts to make existing databases scale out. But because the designs are hindered by legacy code, scale-out is very much bolted on rather than fundamental to the design. You'll encounter this if you try to scale most of the well-known database engines. Adding slave servers can be quite difficult to set up, and you'll notice that it comes with significant limitations, some of which may require re-jigging your database tables.

For example, most of them are master/(multi-)slave rather than multi-master designs. In other words, you might have an entire server sitting there, unable to process queries. Some can serve queries, but with limitations: in a read-only multi-slave design, one server accepts writes and all the others provide read-only data. You'll notice when you set these systems up that it's not always a straightforward process and can be difficult to get working well. It feels very much like a bolt-on addition in many cases.
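
To make that concrete, here's a toy sketch (plain Python, with a stubbed-out `connect()` standing in for a real database driver; the host names are made up) of the read/write splitting your application ends up doing itself when the engine only gives you a single writable master:

```python
import random

# Stand-in for a real database driver's connect(); in practice this would
# be a PostgreSQL/MySQL client connection. Host names are hypothetical.
def connect(host):
    return {"host": host}

class ReadWriteRouter:
    """Routes writes to the single master and spreads reads across
    read-only slaves -- the split the application has to manage itself
    when the engine only supports master/(multi-)slave replication."""

    def __init__(self, master_host, slave_hosts):
        self.master = connect(master_host)
        self.slaves = [connect(h) for h in slave_hosts]

    def connection_for(self, sql):
        # Naive classification: anything that isn't a SELECT must go to
        # the master, because the slaves cannot accept writes at all.
        if sql.lstrip().upper().startswith("SELECT"):
            return random.choice(self.slaves)  # may serve stale data
        return self.master

router = ReadWriteRouter("db-master", ["db-slave-1", "db-slave-2"])
print(router.connection_for("SELECT * FROM orders"))      # goes to a slave
print(router.connection_for("UPDATE orders SET x = 1"))   # must hit master
```

Proxy layers and connection poolers can do this more robustly, but the underlying asymmetry is the same: one writer, many readers that may lag behind.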

On the other hand, there are some newer database engines designed with concurrency and multi-master operation from the beginning. NoSQL and NewSQL are the new design classes.

So it would seem the best way to get better performance from a traditional SQL server is to scale up, while with NoSQL and NewSQL you can both scale up and scale out.

The reason traditional RDBMS systems are tightly coupled is because they all need a consistent view of the same data. When you have multiple servers accepting updates to the same data from different clients, which one do you trust? Any method that attempts to ensure that the data is consistent through some sort of locking mechanism requires cooperation from other servers that either hurts performance or affects data quality in that any data read from a client might be out of date. And the servers need to decide among themselves which data is most recent when writing to the same record. As you can see it's a complex problem made more complex by the fact that the workload is spread across servers and not just among processes or threads where access to the data is still quite fast.