Consistent Hashing
ScalingConsistent hashing is a distributed hashing technique that minimizes the number of keys that need to be remapped when the number of nodes in a system changes, making it ideal for distributed caches and databases.
In traditional hashing (key mod N), adding or removing a node requires remapping nearly all keys. Consistent hashing solves this by mapping both keys and nodes onto a circular hash ring. Each key is assigned to the first node encountered when walking clockwise from the key's position on the ring.
When a node is added, only the keys between the new node and its predecessor need to move. When a node is removed, only its keys redistribute to the next node. This means on average only K/N keys need to move (where K is total keys and N is total nodes).
Virtual nodes (vnodes) improve the distribution by assigning multiple positions on the ring to each physical node, reducing hotspots caused by uneven hash distributions.
Consistent hashing is used extensively in distributed systems including Amazon DynamoDB, Apache Cassandra, Memcached, and content delivery networks.
Related Terms
Ready to design?
Practice using consistent hashing in a real system design on Supaboard's interactive whiteboard.
Browse Challenges