Caching
Also known as: cache layer, caching layer
Caching is the technique of storing copies of frequently accessed data in a faster storage layer so that future requests can be served more quickly, reducing latency and load on the primary data source.
Caches exploit the principle of locality — recently or frequently accessed data is likely to be accessed again soon. By storing this data in a faster medium (typically in-memory), systems can serve requests orders of magnitude faster than reading from disk or making network calls.
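As a minimal sketch of this idea, Python's built-in `functools.lru_cache` memoizes a function's results in memory: the first call pays the full cost, and repeated calls with the same argument (temporal locality) are served from the cache. The `expensive_lookup` function below is a hypothetical stand-in for a slow disk read or network call.

```python
from functools import lru_cache


@lru_cache(maxsize=128)  # keep the 128 most recently used results in memory
def expensive_lookup(key: str) -> str:
    # Hypothetical stand-in for a slow disk read or network call.
    return key.upper()


expensive_lookup("user:42")                # miss: computed, then stored
expensive_lookup("user:42")                # hit: served from the in-memory cache
print(expensive_lookup.cache_info().hits)  # -> 1
```

`maxsize` bounds memory use; once the cache is full, the least recently used entry is evicted, which directly encodes the locality assumption above.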
Caching operates at multiple levels in a system: browser caches, CDN edge caches, application-level caches (e.g., Redis or Memcached), and database query caches. Each layer reduces the load on the layer behind it.
Key design decisions include the write and population strategy (write-through, write-behind, write-around, cache-aside), the eviction policy (LRU, LFU, TTL-based), consistency requirements (how much staleness is acceptable), and cold-start behavior when the cache is empty.
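Two of these decisions, cache-aside reads and TTL-based eviction, can be combined in a short sketch. The `TTLCache` class and the `db` dict below are hypothetical, not part of any library: the cache is checked first, a miss falls through to the source of truth, and the result is stored with an expiry time so staleness is bounded by the TTL.

```python
import time


class TTLCache:
    """Minimal in-memory cache whose entries expire after ttl seconds."""

    def __init__(self, ttl: float):
        self.ttl = ttl
        self._store = {}  # key -> (value, expiry_timestamp)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # stale: evict and treat as a miss
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, time.monotonic() + self.ttl)


def get_user(cache, db, user_id):
    # Cache-aside: check the cache first, fall back to the primary
    # data source on a miss, then populate the cache for later reads.
    user = cache.get(user_id)
    if user is None:
        user = db[user_id]  # hypothetical primary store (a dict here)
        cache.set(user_id, user)
    return user


db = {"42": {"name": "Ada"}}
cache = TTLCache(ttl=60)
get_user(cache, db, "42")  # miss: reads db, populates the cache
get_user(cache, db, "42")  # hit: served from the cache
```

A short TTL trades freshness for load reduction: the primary store is queried at most once per key per TTL window, at the cost of serving data up to `ttl` seconds stale.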
The famous adage "there are only two hard things in computer science: cache invalidation and naming things" reflects the real challenge — keeping cached data in sync with the source of truth.