Introduction

Key-value store: simplest NoSQL database. Data model: key -> value mapping. Extreme simplicity: only get, set, delete operations. Powers social media, caching, real-time analytics.

Advantages: ultra-fast access (O(1) hash lookup), horizontal scaling, high throughput, simple operations. Disadvantages: no complex queries, no secondary indexes (lookup by key only), eventual consistency common.

Foundation: hash tables. Single-node: classic hash table. Distributed: partition across servers using consistent hashing. Examples: Redis, Memcached, DynamoDB, Cassandra.

"Key-value stores trade expressive power for performance. Simplicity enables massive scalability. Foundation for modern web services, caching, real-time systems." -- NoSQL systems

Key-Value Data Model

Simplicity

Data: key-value pairs. Key: unique identifier (string, binary). Value: any data (string, number, JSON, blob). Schema-less: values can change type.

Operations

GET(key): retrieve value. SET(key, value): store value. DELETE(key): remove. Optional: increment, append, TTL (expiration). Minimal API.

Example

Keys: user:1001, user:1002, product:A1, product:B2
Values: {name: "Alice", age: 30}, {name: "Bob", age: 25}

GET(user:1001) -> {name: "Alice", age: 30}
SET(user:1001, {name: "Alice", age: 31})
DELETE(user:1001)
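The GET/SET/DELETE interface above can be sketched as a thin wrapper over a Python dict (a minimal single-process sketch; the class name `KVStore` is illustrative, not a real library):

```python
class KVStore:
    """Minimal in-process key-value store: a thin wrapper over a hash table."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        # Return the value, or None if the key is absent.
        return self._data.get(key)

    def set(self, key, value):
        # Store or overwrite; values are schema-less, any type is accepted.
        self._data[key] = value

    def delete(self, key):
        # Remove the key if present; deleting a missing key is a no-op.
        self._data.pop(key, None)


store = KVStore()
store.set("user:1001", {"name": "Alice", "age": 30})
store.set("user:1001", {"name": "Alice", "age": 31})  # overwrite in place
print(store.get("user:1001"))  # {'name': 'Alice', 'age': 31}
store.delete("user:1001")
print(store.get("user:1001"))  # None
```

Everything that follows (collision handling, persistence, distribution) is machinery to make this three-call API fast, durable, and scalable.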

Value Types

Strings: simple values. Numbers: counters. Collections: lists, sets, hashes (Redis). Complex: JSON (often serialized as string).

Expiration (TTL)

Key expires after time T. Automatic cleanup. Useful: caching (auto-invalidate), sessions (auto-logout), rate limiting. Implementation: lazy check on access, plus background cleanup sweeps.
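A lazy-expiry sketch of TTL (expiry is checked only on access here; real systems such as Redis additionally run background sweeps):

```python
import time


class TTLStore:
    """Key-value store with per-key expiration, cleaned up lazily on read."""

    def __init__(self):
        self._data = {}  # key -> (value, expires_at or None)

    def set(self, key, value, ttl=None):
        # ttl is in seconds; None means the key never expires.
        expires_at = time.monotonic() + ttl if ttl is not None else None
        self._data[key] = (value, expires_at)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and time.monotonic() >= expires_at:
            del self._data[key]  # expired: lazy delete on access
            return None
        return value


store = TTLStore()
store.set("session:42", "alice", ttl=0.05)  # expires after 50 ms
print(store.get("session:42"))              # "alice" while fresh
time.sleep(0.06)
print(store.get("session:42"))              # None after expiry
```

Lazy expiry alone lets dead keys linger in memory until touched, which is why production stores pair it with a background reaper.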

Hash Tables Foundation

Hash Function

Map key to index: h(key) -> index [0, n-1]. Fast: O(1) average. Good function: uniform distribution, deterministic, fast to compute.

Array Storage

Store pairs in array. Array[h(key)] = value. Direct access: given key, compute hash, retrieve value immediately. Ultra-fast.

Load Factor

Ratio: entries / array size. High load: many collisions, slow. Low load: wasted space. Typical: keep < 0.75. Resize (rehash) if exceeding.

Resizing

When load factor is high: allocate larger array, rehash all entries. O(n) cost per resize, amortized O(1) per insert. Happens rarely; in practice barely noticeable.

Complexity

Average: O(1) get/set/delete. Worst case: O(n) if hash poor or many collisions. Practice: excellent performance, rarely worst-case.
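The hashing, load-factor, and resizing mechanics above can be combined into one small sketch (separate chaining is used here for collisions; the class name is illustrative):

```python
class ChainedHashTable:
    """Hash table with separate chaining and load-factor-triggered resizing."""

    def __init__(self, capacity=8, max_load=0.75):
        self._buckets = [[] for _ in range(capacity)]
        self._size = 0
        self._max_load = max_load

    def _index(self, key):
        # h(key) -> index in [0, n-1]
        return hash(key) % len(self._buckets)

    def set(self, key, value):
        bucket = self._buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # key exists: overwrite
                return
        bucket.append((key, value))       # collision: append to the chain
        self._size += 1
        if self._size / len(self._buckets) > self._max_load:
            self._resize()

    def get(self, key):
        for k, v in self._buckets[self._index(key)]:
            if k == key:
                return v
        return None

    def _resize(self):
        # O(n) rehash into a doubled array; amortized O(1) per insert.
        old = self._buckets
        self._buckets = [[] for _ in range(2 * len(old))]
        for bucket in old:
            for k, v in bucket:
                self._buckets[self._index(k)].append((k, v))


table = ChainedHashTable()
for i in range(20):
    table.set(f"key{i}", i)  # pushes load past 0.75, forcing resizes
print(table.get("key7"))     # 7
```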

Collision Resolution Strategies

Separate Chaining

Each bucket: linked list of entries. Collision: append to list. Lookup: search list. Simple, but chains increase lookup time.

Open Addressing

Find another empty slot on collision. Linear probing: check the next slot. Quadratic probing: probe at quadratic offsets (1, 4, 9, ...). Double hashing: step by a second hash. Fewer memory allocations than chaining.

Linear Probing

Collision at slot i: try i+1, i+2, ... until empty. Simple. Risk: clustering (long runs of full slots). Slow searches.
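Linear probing as two small functions over a fixed-size slot array (a sketch that assumes the table never fills; deletion is omitted because open addressing needs tombstones to delete correctly):

```python
def probe_insert(slots, key, value):
    """Insert via linear probing: on collision, walk to the next free slot."""
    n = len(slots)
    i = hash(key) % n
    # Stop at an empty slot, or at the key itself (overwrite case).
    while slots[i] is not None and slots[i][0] != key:
        i = (i + 1) % n  # collision: try the next slot, wrapping around
    slots[i] = (key, value)


def probe_get(slots, key):
    """Follow the same probe sequence; an empty slot means the key is absent."""
    n = len(slots)
    i = hash(key) % n
    while slots[i] is not None:
        if slots[i][0] == key:
            return slots[i][1]
        i = (i + 1) % n
    return None


slots = [None] * 8
for k, v in [("a", 1), ("b", 2), ("c", 3)]:
    probe_insert(slots, k, v)
print(probe_get(slots, "b"))  # 2
```

The clustering risk is visible in the code: every collision extends a contiguous run of occupied slots, which later probes must walk through.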

Quadratic Probing

Try i+1^2, i+2^2, i+3^2, ... Reduces clustering. Better than linear. Caveat: may not probe all slots unless table size and probe sequence are chosen carefully (e.g. prime table size).

Performance Impact

Good hash function: few collisions, near O(1). Poor function: many collisions, degraded. Choice: trade complexity vs. performance.

In-Memory Key-Value Stores

Examples

Redis: in-memory, persistent option, pub/sub. Memcached: pure in-memory cache. Fast: data in RAM, no disk I/O.

Speed Advantage

RAM access is orders of magnitude faster than disk. In-memory: microsecond latencies. For remote clients, network latency dominates (milliseconds). Ultra-responsive.

Persistence Options

Risk: pure in-memory data is lost on crash. Options: RDB (snapshots), AOF (append-only log), hybrid. Trade-off: speed vs. safety.

Memory Constraints

Data limited by RAM size. Scaling: cache subset of data. Eviction policies: LRU (least recently used), LFU (least frequently), TTL. Manage memory.
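LRU eviction, the most common policy, is a small amount of code on top of Python's `collections.OrderedDict` (a sketch of the policy, not of any particular store's implementation):

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity):
        self._capacity = capacity
        self._data = OrderedDict()  # insertion order doubles as recency order

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def set(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self._capacity:
            self._data.popitem(last=False)  # evict least recently used


cache = LRUCache(2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")         # touch "a" so it becomes most recent
cache.set("c", 3)      # capacity exceeded: evicts "b"
print(cache.get("b"))  # None
print(cache.get("a"))  # 1
```

LFU instead tracks access counts, and TTL-based eviction drops expired keys first; all three bound memory when the working set exceeds RAM.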

Use Cases

Caching: speed up databases. Sessions: user state. Counters: page views, metrics. Real-time: leaderboards. Pub/sub: messaging.

Persistent Key-Value Stores

Disk-Based

Data written to disk. Survives crashes. Slower than in-memory but durable. Examples: LevelDB, RocksDB (both built on LSM trees).

Log-Structured Merge (LSM) Trees

Fast writes: write to memory first (memtable), later flush to disk (SSTables). Compaction: merge SSTables. Efficient for write-heavy workloads.
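A toy LSM tree capturing the memtable-to-SSTable flow (in-memory sorted runs stand in for on-disk SSTables; compaction is omitted):

```python
import bisect


class MiniLSM:
    """Toy LSM tree: writes land in a memtable; when it fills, it is flushed
    as an immutable sorted run (a stand-in for an on-disk SSTable).
    Reads check the memtable first, then runs from newest to oldest."""

    def __init__(self, memtable_limit=4):
        self._memtable = {}
        self._runs = []  # list of sorted [(key, value), ...], newest last
        self._limit = memtable_limit

    def set(self, key, value):
        self._memtable[key] = value  # write hits memory only: fast
        if len(self._memtable) >= self._limit:
            self._flush()

    def _flush(self):
        # Sort once at flush time; the run is never modified afterwards.
        self._runs.append(sorted(self._memtable.items()))
        self._memtable = {}

    def get(self, key):
        if key in self._memtable:
            return self._memtable[key]
        for run in reversed(self._runs):  # newest run wins on duplicates
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None


db = MiniLSM()
for i in range(10):
    db.set(f"k{i}", i)
print(db.get("k2"))  # 2, found via binary search in a flushed run
```

Compaction, the missing piece, would periodically merge runs so reads touch fewer of them and overwritten values are garbage-collected.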

B-Trees

Traditional: balanced tree, ordered keys. Good for range queries. Disk-optimized: group keys in blocks (pages). Examples: SQLite, LMDB.

Durability Guarantees

Fsync: force write to disk. Slow but safe. No fsync: fast but risky. Batching: group writes, fsync periodically. Balance: performance vs. durability.
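The batching trade-off can be made concrete with an append-only log that fsyncs every N records (a sketch; the class name and batch size are illustrative):

```python
import os
import tempfile


class BatchedLog:
    """Append-only log that groups writes and fsyncs every `batch` appends.
    Trade-off: up to `batch - 1` buffered records can be lost on a crash,
    in exchange for far fewer expensive fsync calls."""

    def __init__(self, path, batch=100):
        self._f = open(path, "ab")
        self._batch = batch
        self._pending = 0
        self.fsync_count = 0

    def append(self, record: bytes):
        self._f.write(record + b"\n")
        self._pending += 1
        if self._pending >= self._batch:
            self.flush()

    def flush(self):
        self._f.flush()
        os.fsync(self._f.fileno())  # force the OS to persist to disk
        self.fsync_count += 1
        self._pending = 0


path = os.path.join(tempfile.mkdtemp(), "wal.log")
log = BatchedLog(path, batch=100)
for i in range(1000):
    log.append(b"set k%d" % i)
print(log.fsync_count)  # 10 fsyncs instead of 1000
```

Fsync-per-write would make every append wait on the disk; batching amortizes that cost across 100 records at the price of a small durability window.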

Performance Tuning

Compression: reduce disk space, trade CPU. Caching: RAM for frequently accessed data. Indexing: fast lookup. Optimization: crucial for performance.

Distributed Key-Value Stores

Scaling

Single server limited by RAM/CPU/disk. Distribute: partition keys across servers. Each server subset. Scale horizontally: add servers.

Consistent Hashing

Naive mapping of key to server: h(key) mod N. Problem: adding a server remaps nearly all keys (mass rehashing). Consistent hashing: ring topology; adding a server moves only nearby keys. Minimal disruption.

Replication

Copy data to multiple servers. Availability: if one fails, others serve. Consistency: often eventual (writes may not be immediately visible on all replicas).

Quorum Reads/Writes

Write to W replicas, read from R replicas, with N the replication factor. W + R > N: read and write quorums overlap, so every read contacts at least one replica holding the latest acknowledged write. Otherwise: eventual. Tune: consistency vs. availability.
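A tiny simulation makes the overlap argument concrete: with N = 3, W = 2, R = 2, any 2-replica read set must intersect any 2-replica write set (pigeonhole), so the read always sees the write. (The `write`/`read` helpers are hypothetical, for illustration only.)

```python
import random


def write(replicas, key, value, version, w):
    """A write is acknowledged once W replicas have applied it."""
    for replica in random.sample(replicas, w):
        replica[key] = (version, value)


def read(replicas, key, r):
    """Read R replicas and return the value with the highest version."""
    answers = [rep.get(key) for rep in random.sample(replicas, r)]
    answers = [a for a in answers if a is not None]
    return max(answers)[1] if answers else None


n, w, r = 3, 2, 2            # W + R = 4 > N = 3: quorums must overlap
replicas = [{} for _ in range(n)]
write(replicas, "x", "v1", version=1, w=w)
print(read(replicas, "x", r=r))  # "v1": guaranteed despite random sampling
```

With W = R = 1 the same code can return None or stale data, which is exactly the eventual-consistency regime.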

Examples

DynamoDB: AWS managed, distributed. Cassandra: distributed, eventually consistent (tunable). Riak: distributed, tunable consistency. All scale horizontally.

Hashing and Partitioning Strategies

Hash Functions

Properties: deterministic (same key -> same hash), uniform (hashes spread evenly), fast. Examples: MurmurHash, CityHash (fast, non-cryptographic); cryptographic hashes like SHA-1 also work for placement but are slower.

Range Partitioning

Partition by key range (a-m, n-z). Simple, but unbalanced (some ranges more data). Rebalancing complex.

Hash Partitioning

h(key) determines server. Balanced: uniform distribution. Rebalancing: consistent hashing reduces movement. Standard: hash partitioning.

Consistent Hashing

Ring: servers placed on circle. Key: find clockwise nearest server. Add server: affects only keys between previous and new. Minimal movement.

Virtual Nodes

Each server: multiple points on ring. Balances load: servers with more capacity get more virtual nodes. Flexibility: accommodate heterogeneous clusters.
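A consistent-hash ring with virtual nodes fits in one small class (a sketch; MD5 is used here just as a convenient stable hash, and the vnode count is arbitrary):

```python
import bisect
import hashlib


class HashRing:
    """Consistent-hash ring with virtual nodes. Each server occupies many
    points on the ring; a key maps to the first server point clockwise
    from its hash. Adding a server moves only the keys between its new
    points and their predecessors."""

    def __init__(self, vnodes=100):
        self._ring = []  # sorted list of (point, server)
        self._vnodes = vnodes

    @staticmethod
    def _hash(s):
        # Any stable, well-distributed hash works; MD5 is convenient here.
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def add_server(self, server):
        for i in range(self._vnodes):
            bisect.insort(self._ring, (self._hash(f"{server}#{i}"), server))

    def lookup(self, key):
        points = [p for p, _ in self._ring]
        # First ring point at or after the key's hash, wrapping around.
        i = bisect.bisect(points, self._hash(key)) % len(self._ring)
        return self._ring[i][1]


ring = HashRing()
for s in ("server-a", "server-b", "server-c"):
    ring.add_server(s)

before = {k: ring.lookup(k) for k in (f"key{i}" for i in range(1000))}
ring.add_server("server-d")  # 3 -> 4 servers
moved = sum(1 for k, s in before.items() if ring.lookup(k) != s)
print(f"{moved / 1000:.0%} of keys moved")  # roughly 1/4, never all
```

Contrast with `h(key) mod N`, where going from 3 to 4 servers remaps about 3/4 of all keys; here only the slice claimed by the new server's points moves.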

Consistency and Replication

Replication Models

Master-slave (primary-replica): writes go to the master, replicated to read-only slaves. Multi-master: any node accepts writes. Leaderless: any replica accepts reads and writes; consistency enforced via quorums (Dynamo-style).

Consistency Levels

Strong: all reads see latest write. Eventual: eventually consistent. Causal: preserve causality. Tune: consistency-availability trade-off.

Conflict Resolution

Concurrent writes diverge. Resolution: timestamp (last-write-wins), application-defined, vector clocks. Merge or pick.
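Last-write-wins, the simplest resolution strategy, is a one-line comparison on timestamps (a sketch; the record shape with `ts`/`value` fields is illustrative):

```python
def lww_merge(a, b):
    """Last-write-wins: keep the version with the later timestamp.
    Simple but lossy: the concurrent loser's write is silently dropped,
    which is why some systems prefer vector clocks or app-level merges."""
    return a if a["ts"] >= b["ts"] else b


# Two replicas diverged under concurrent writes:
replica1 = {"ts": 105, "value": {"name": "Alice", "age": 31}}
replica2 = {"ts": 103, "value": {"name": "Alice", "age": 30}}
print(lww_merge(replica1, replica2)["value"])  # the ts=105 write wins
```

Vector clocks would instead detect that neither version happened-before the other and surface both to the application for a real merge.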

Write Concerns

Unacknowledged: fast, risky. Acknowledged: one replica confirms. Majority: majority replicas confirm. Trade-off: speed vs. durability.

Consistency Tuning

Per-operation: choose consistency level. Strong: slower. Eventual: faster. Flexible: application requirements determine.

Performance Characteristics

Access Time

In-memory: microseconds. Disk-based: milliseconds. Network: milliseconds (remote). Overall: network dominates in distributed.

Throughput

Single-threaded: thousands ops/sec. Multi-threaded: millions. Distributed: scale linearly with servers. Network bandwidth: bottleneck at scale.

Latency vs. Throughput

Latency: per-operation time. Throughput: operations per second. Different goals. Batch operations: improve throughput, increase latency. Trade-off.

Benchmarking

Measure: operations/sec, latency (p50, p99, p99.9). Under load: latency increases. Monitor: identify bottlenecks. Tune: optimize critical paths.
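Percentile latencies are computed by sorting samples and picking by rank; a nearest-rank sketch (the sample numbers are made up for illustration):

```python
def percentile(samples, p):
    """Nearest-rank percentile: the value at or below which p% of samples fall."""
    ordered = sorted(samples)
    rank = max(0, min(len(ordered) - 1, round(p / 100 * len(ordered)) - 1))
    return ordered[rank]


# Hypothetical per-request latencies in milliseconds.
latencies = [1.1, 1.2, 1.2, 1.3, 1.4, 1.5, 1.6, 2.0, 4.8, 25.0]
print("p50:", percentile(latencies, 50))  # typical request: 1.4
print("p99:", percentile(latencies, 99))  # tail outlier: 25.0
```

The gap between p50 and p99 is the point: averages hide the tail, and it is the p99/p99.9 outliers that users and SLOs actually feel.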

Scaling Limits

Network bandwidth: eventually saturated. Storage per server: RAM/disk full. Consistency: harder with more servers. Plan: understand scaling limits.

Popular Implementations

Redis

In-memory, rich data structures (lists, sets, hashes), pub/sub, persistence options (RDB, AOF). Fast, versatile. Standard caching/session store.

Memcached

Pure cache: in-memory, no persistence. Simple, fast, distributed. LRU eviction. Web applications: speed up databases.

DynamoDB

AWS managed, distributed, eventually consistent reads by default (strongly consistent reads optional). Pay-per-request. Global tables: multi-region. Enterprise standard.

Cassandra

Distributed, highly available, tunable consistency. LSM-tree storage, eventually consistent by default. Proven at large scale. Operational complexity.

RocksDB

Embedded, high performance, LSM trees. Used in CockroachDB, TiKV, MyRocks. Library: integrate into applications. Not a standalone server.

Real-World Applications

Caching

Speed up databases: cache frequent queries. Reduces load, improves latency. Application-managed: decide what to cache, TTL. Critical for performance.

Sessions

User login state: store in key-value. Fast access, shared across servers. TTL auto-logout. Distributed: multi-region sessions.

Real-Time Analytics

Counters, metrics: update frequently. Key-value fast increments. Leaderboards: sorted sets. Real-time dashboards depend on key-value speed.

Message Queues

Pub/sub, queues: Redis, Kafka. Decouple systems, handle spikes. Scalable, low-latency. Event-driven architectures.

Data Locality

Hot data in cache, cold on disk. Tier-based: fast/expensive (in-memory) for hot, slow/cheap (disk) for cold. Cost optimization.

References

  • DeCandia, G., et al. "Dynamo: Amazon's Highly Available Key-Value Store." Proceedings of SOSP, 2007.
  • Lakshman, A., and Malik, P. "Cassandra: A Decentralized Structured Storage System." Proceedings of ACM SIGOPS, 2010.
  • Dong, S., et al. "RocksDB: a High-Performance Embedded Database." 2013.
  • Kleppmann, M. "Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems." O'Reilly Media, 2017.
  • Karger, D., et al. "Consistent Hashing and Random Trees: Distributed Caching Protocols for Relieving Hot Spots on the World Wide Web." Proceedings of ACM STOC, 1997.