Vertical Scaling

Also known as: scaling up, scale up

Vertical scaling (scaling up) means increasing the resources of a single machine — adding more CPU, RAM, or storage — to handle increased load, as opposed to adding more machines.

Vertical scaling is the simplest scaling approach: when your server is overloaded, upgrade it to a bigger one. It typically requires no changes to application architecture, because a single-server application works the same on a small instance as on a powerful one.

Advantages include simplicity (no distributed-system complexity), straightforward consistency (all data lives on one machine, so there is no replication lag), and lower operational overhead (one server to monitor and maintain).

Limitations are significant: there is a hardware ceiling (you cannot add infinite CPU or RAM to one machine), the server remains a single point of failure (if that one big server goes down, everything goes down), and costs increase non-linearly (doubling resources often more than doubles the price).
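
The hardware ceiling and the non-linear cost curve can be illustrated with a minimal Python sketch. The tier names, vCPU counts, and hourly prices below are hypothetical values invented for illustration; they do not reflect any real provider's catalog.

```python
from typing import NamedTuple, Optional


class Tier(NamedTuple):
    name: str
    vcpus: int
    hourly_price: float  # hypothetical price, not real provider data


# Hypothetical instance catalog, ordered from smallest to largest.
TIERS = [
    Tier("small", 2, 0.10),
    Tier("medium", 4, 0.22),
    Tier("large", 8, 0.50),
    Tier("xlarge", 16, 1.20),
]


def pick_tier(required_vcpus: int) -> Optional[Tier]:
    """Return the smallest tier that fits, or None once the ceiling is hit."""
    for tier in TIERS:
        if tier.vcpus >= required_vcpus:
            return tier
    return None  # no single machine in the catalog is big enough


def price_per_vcpu(tier: Tier) -> float:
    """Unit cost; in this made-up catalog it rises as the tier grows."""
    return tier.hourly_price / tier.vcpus
```

In this assumed catalog, `pick_tier(3)` returns the `medium` tier, while `pick_tier(32)` returns `None`: past the largest tier, scaling up is no longer an option. The price per vCPU also climbs from tier to tier, which is the "more than doubles" effect in miniature.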

In system design interviews, vertical scaling is appropriate for small-to-medium systems or as a short-term solution. For large-scale systems, horizontal scaling is almost always required. Many real-world architectures use both: vertically scale individual nodes while horizontally scaling the fleet.
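
How the two approaches combine can be sketched as a rough capacity estimate. The function name, the requests-per-second figures, and the single spare node are assumptions chosen for illustration, not a sizing rule from any particular system.

```python
import math


def nodes_needed(peak_rps: float, rps_per_node: float, spare_nodes: int = 1) -> int:
    """Estimate fleet size: enough nodes for peak load plus spares for failures.

    rps_per_node is the vertical lever (a bigger node handles more traffic);
    the returned count is the horizontal lever.
    """
    if rps_per_node <= 0:
        raise ValueError("rps_per_node must be positive")
    return math.ceil(peak_rps / rps_per_node) + spare_nodes


# Hypothetical numbers: doubling per-node capacity shrinks the fleet.
fleet_small_nodes = nodes_needed(10_000, 1_500)  # 7 nodes for load + 1 spare = 8
fleet_large_nodes = nodes_needed(10_000, 3_000)  # 4 nodes for load + 1 spare = 5
```

Under these assumptions, scaling each node up reduces how many nodes the fleet needs, but the spare node stays: redundancy is something only the horizontal dimension provides.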
