Horizontal Scaling
ScalingAlso known as: scaling out, scale out
Horizontal scaling (scaling out) means adding more machines to a system to handle increased load, distributing work across multiple servers rather than upgrading a single machine.
Horizontal scaling contrasts with vertical scaling (scaling up), which means adding more CPU, RAM, or storage to an existing machine. Horizontal scaling is generally preferred in distributed systems because it has no theoretical upper limit and provides redundancy.
For horizontal scaling to work, application state must be externalized — if any server can handle any request, servers cannot store session data locally. This drives architectural patterns like stateless services, shared caches (Redis), and distributed databases.
Horizontal scaling applies at every layer: web servers behind a load balancer, read replicas for databases, multiple workers consuming from a message queue, and distributed cache clusters.
Key considerations include data partitioning strategy (how to split data across nodes), consistency models, network overhead (more nodes means more inter-node communication), and the operational complexity of managing a larger fleet of machines.
Related Terms
Ready to design?
Practice using horizontal scaling in a real system design on Supaboard's interactive whiteboard.
Browse Challenges