Definition and Overview
Concept
LRU Cache: caching strategy evicting least recently accessed elements on capacity overflow. Goal: maximize cache hit ratio by temporal locality assumption. Usage: memory management, database systems, web browsers, CPU caches.
Cache Function
Stores key-value pairs with bounded capacity. On insert beyond capacity, removes least recently used item. Access updates recency status, promoting item to most recently used.
Eviction Policy
Least Recently Used eviction: oldest-accessed item removed first. Justification: recently used data predicts future access better than long-idle data.
"Caching is a fundamental method for improving system efficiency by exploiting temporal locality." -- Jeffrey Ullman
Core Principles
Temporal Locality
Assumes recently accessed data likely accessed again soon. Justifies prioritizing recent accesses in cache retention.
Capacity Constraint
Cache size limited. Eviction necessary when full. Algorithm must identify eviction candidate efficiently.
Recency Tracking
Maintains order of use. Updates on every access or insertion. Enables quick identification of least recently used item.
Consistency
Cache state must remain consistent across lookups, insertions, and evictions. In multi-threaded environments, synchronization mechanisms preserve this consistency.
Underlying Data Structures
Hash Map
Maps keys to nodes storing values and usage metadata. Provides O(1) average-time lookup and update.
Doubly Linked List
Maintains access order from most to least recent. Supports O(1) insertion, deletion, and move-to-front operations.
Node Structure
Contains key, value, pointers to prev and next nodes. Enables bidirectional traversal and fast removal.
Combined Use
Hash map provides node reference; linked list maintains order. Combination enables efficient LRU operations.
| Data Structure | Purpose | Time Complexity |
|---|---|---|
| Hash Map | Key to node mapping | O(1) average |
| Doubly Linked List | Maintains usage order | O(1) insertion/deletion |
LRU Algorithm Mechanics
Access Operation
Lookup key in hash map. If found, move associated node to front of linked list to mark recently used.
Insertion Operation
Check capacity. If full, remove tail node (least recently used) from list and hash map. Insert new node at front, update map.
Eviction Operation
Eviction triggered on capacity overflow. Removes node at tail of linked list. Updates hash map accordingly.
Update Operation
On cache hit, refresh position of node to front. On miss with new key, add new node and possibly evict.
```
Function get(key):
    if key not in cache:
        return -1
    node = cache[key]
    move node to front of linked list
    return node.value

Function put(key, value):
    if key in cache:
        update node.value
        move node to front
    else:
        if cache.size == capacity:
            remove tail node from list and cache
        create new node
        add node to front
        add key-node to cache
```

Common Implementations
Programming Languages
Implemented in Java (LinkedHashMap), C++ (std::list + std::unordered_map), Python (collections.OrderedDict, functools.lru_cache decorator).
Library Support
Standard libraries provide LRU or similar caches. Custom implementations allow tuning capacity, concurrency.
Thread Safety
Concurrent environments require locks or atomic operations to maintain consistency.
Customization
Variants include weighted caching, expiration policies, write-back mechanisms.
Time and Space Complexity
Lookup
O(1) average due to hash map direct access.
Insertion and Eviction
O(1) amortized by linked list node repositioning and map update.
Space Complexity
O(n) proportional to cache capacity n. Stores nodes and hash entries.
| Operation | Time Complexity | Space Complexity |
|---|---|---|
| Lookup (get) | O(1) average | - |
| Insert (put) | O(1) amortized | - |
| Eviction | O(1) | - |
| Overall space | - | O(n) |
Practical Use Cases
Operating Systems
Page replacement in virtual memory management using LRU to decide which pages to swap out.
Web Browsers
Cache recently visited pages or images to reduce latency and bandwidth consumption.
Databases
Buffer pool management using LRU to retain frequently accessed data blocks in memory.
Networking
Router cache for recently resolved IP addresses or routes.
Programming Libraries
Memoization caches, API response caching to optimize repeated calls.
Advantages and Limitations
Advantages
Simple and intuitive. Efficient O(1) operations. Balances recency and capacity well. Widely supported.
Limitations
Requires additional memory for pointer and hash structures. Not optimal for all workloads: LRU is not scan-resistant, so a single large sequential scan can evict the entire working set (cache pollution).
Workload Sensitivity
Performs poorly if access patterns do not exhibit temporal locality.
Concurrency Challenges
Synchronization overhead in multi-threaded contexts may reduce throughput.
Variants and Related Algorithms
LFU (Least Frequently Used)
Evicts least frequently accessed items instead of least recent.
ARC (Adaptive Replacement Cache)
Balances recency and frequency dynamically.
Clock Algorithm
Approximation of LRU with lower overhead.
Segmented LRU (SLRU)
Divides cache into segments for different recency classes.
2Q Algorithm
Uses two queues to separate new and frequent items.
Optimization Techniques
Data Structure Tuning
Use of specialized linked list implementations to reduce pointer overhead.
Lazy Eviction
Delay and batch removals to amortize cost.
Concurrency Control
Lock-free or fine-grained locking to minimize contention.
Cache Partitioning
Divide cache among threads or data types to improve locality.
Hardware Support
CPU caches approximate LRU (pseudo-LRU) with a few status bits per set, since tracking exact recency in hardware is too costly.
Code Example
Python Implementation
```python
class Node:
    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.prev = None
        self.next = None

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = dict()              # key -> Node
        self.head = Node(0, 0)           # sentinel: most recent side
        self.tail = Node(0, 0)           # sentinel: least recent side
        self.head.next = self.tail
        self.tail.prev = self.head

    def _remove(self, node):
        prev_node = node.prev
        next_node = node.next
        prev_node.next = next_node
        next_node.prev = prev_node

    def _add(self, node):
        # Insert right after the head sentinel (most recent position).
        node.next = self.head.next
        node.prev = self.head
        self.head.next.prev = node
        self.head.next = node

    def get(self, key):
        if key not in self.cache:
            return -1
        node = self.cache[key]
        self._remove(node)
        self._add(node)                  # move to front on access
        return node.value

    def put(self, key, value):
        if key in self.cache:
            self._remove(self.cache[key])
        node = Node(key, value)
        self._add(node)
        self.cache[key] = node
        if len(self.cache) > self.capacity:
            lru = self.tail.prev         # least recently used node
            self._remove(lru)
            del self.cache[lru.key]
```