Design a Cache System
Design a distributed caching layer for a high-traffic web application. Consider multi-tier caching, write-through vs write-back strategies, cache coherence, and eviction policies.
Functional Requirements
- Store and retrieve key-value data from cache
- Evict entries when cache is full
- Support TTL-based expiration
Non-Functional Requirements
- Sub-millisecond read latency
- Handle cache node failures gracefully
- Scale to millions of operations per second
Frequently Asked Questions
What is the difference between write-through and write-back caching?
Write-through writes data to both the cache and the backing store synchronously, ensuring consistency but adding write latency. Write-back (also called write-behind) writes only to the cache initially and asynchronously flushes to the backing store, offering better write performance but risking data loss if the cache fails before flushing.
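The trade-off can be sketched in a few lines. This is a minimal illustration, not a production design: the backing store is modeled as a plain dict, and the class and method names (`WriteThroughCache`, `WriteBackCache`, `put`, `flush`) are hypothetical.

```python
class WriteThroughCache:
    """Writes go to the cache and the backing store synchronously."""
    def __init__(self, store):
        self.store = store          # backing store, modeled here as a dict
        self.cache = {}

    def put(self, key, value):
        self.cache[key] = value
        self.store[key] = value     # synchronous write: slower, but always consistent

class WriteBackCache:
    """Writes go to the cache only; dirty entries flush later."""
    def __init__(self, store):
        self.store = store
        self.cache = {}
        self.dirty = set()          # keys not yet persisted

    def put(self, key, value):
        self.cache[key] = value
        self.dirty.add(key)         # fast write, but lost if the cache dies before flush()

    def flush(self):
        for key in list(self.dirty):
            self.store[key] = self.cache[key]
        self.dirty.clear()
```

In a real system the write-back flush would run asynchronously (a background thread or queue); here it is a manual method so the data-loss window is easy to see.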
What cache eviction policies should I consider?
LRU (Least Recently Used) evicts the least recently accessed item and works well for most workloads. LFU (Least Frequently Used) evicts the least frequently accessed item, which is better for skewed access patterns. TTL-based eviction removes items after a fixed time. Most systems combine TTL with LRU for best results.
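A TTL+LRU combination can be sketched with an `OrderedDict`, which keeps insertion order and makes "least recently used" cheap to track. The class name and the injectable `clock` parameter are illustrative choices, not part of any particular library:

```python
import time
from collections import OrderedDict

class LRUTTLCache:
    """LRU eviction combined with per-entry TTL expiration."""
    def __init__(self, capacity, ttl_seconds, clock=time.monotonic):
        self.capacity = capacity
        self.ttl = ttl_seconds
        self.clock = clock                    # injectable for deterministic testing
        self.entries = OrderedDict()          # key -> (value, expires_at)

    def get(self, key):
        item = self.entries.get(key)
        if item is None:
            return None
        value, expires_at = item
        if self.clock() >= expires_at:        # lazily expire on read
            del self.entries[key]
            return None
        self.entries.move_to_end(key)         # mark as most recently used
        return value

    def put(self, key, value):
        if key in self.entries:
            self.entries.move_to_end(key)
        self.entries[key] = (value, self.clock() + self.ttl)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used entry
```

Expiring lazily on read keeps writes O(1); a background sweep would be needed if expired entries must also stop counting against capacity.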
How do you prevent cache stampedes?
Cache stampedes occur when many requests simultaneously miss the cache for the same key. Solutions include request coalescing (only one request fetches from the source while others wait), probabilistic early expiration (randomly refreshing before TTL expires), and lock-based approaches where the first requester acquires a lock to rebuild the cache.
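Request coalescing can be sketched with a per-key lock: the first thread to miss acquires the lock and fetches, while concurrent callers block on the same lock and then read the freshly cached value. The class name and the `fetch` callback are hypothetical; a production version would also need TTLs and lock cleanup:

```python
import threading

class CoalescingCache:
    """On a miss, only one thread fetches; concurrent callers wait for its result."""
    def __init__(self, fetch):
        self.fetch = fetch              # fetch(key) -> value; hits the backing store
        self.values = {}
        self.locks = {}                 # per-key locks guarding in-flight fetches
        self.guard = threading.Lock()   # protects the locks dict itself

    def get(self, key):
        if key in self.values:          # fast path: cache hit, no locking
            return self.values[key]
        with self.guard:
            lock = self.locks.setdefault(key, threading.Lock())
        with lock:                      # first caller fetches; the rest block here
            if key not in self.values:  # double-check after acquiring the lock
                self.values[key] = self.fetch(key)
        return self.values[key]
```

The double-check inside the lock is what prevents the stampede: every waiter re-tests the cache before fetching, so the expensive source call runs once per key rather than once per request.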
What is cache warming and when should you use it?
Cache warming is pre-loading frequently accessed data into the cache before it receives real traffic. Use it after deployments, cache restarts, or when launching new features. It prevents the cold-start problem where initial requests all miss the cache and overwhelm the backing store.
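A warming pass is essentially a loop over a known hot-key list. The sketch below assumes a dict-like backing store and a cache exposing a hypothetical `put(key, value)` method; the function name `warm_cache` is illustrative:

```python
def warm_cache(cache, store, hot_keys):
    """Pre-load the hottest keys so the first real requests hit the cache."""
    loaded = 0
    for key in hot_keys:
        value = store.get(key)      # read from the backing store
        if value is not None:
            cache.put(key, value)   # assumed cache interface: put(key, value)
            loaded += 1
    return loaded                   # number of entries actually warmed
```

In practice the hot-key list comes from access logs or the previous cache generation, and warming is throttled so the pre-load itself does not overwhelm the backing store.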