Introduction

Locking: mechanism ensuring concurrent access safety. Transactions acquire locks before access. Locks prevent: dirty reads, lost updates, race conditions. Cost: blocking (reduced concurrency). Balance: correctness vs. performance.

Core idea: mutual exclusion. Reader lock: multiple allowed. Writer lock: exclusive. Conflicts: prevent simultaneous access. Deadlock: potential risk (circular waiting). Management: system responsibility.

Traditional approach: databases use locking. Alternative: MVCC (multi-version concurrency control, modern). Choice: trade-off (simplicity vs. performance).

"Locking ensures isolation: prevents interference between concurrent transactions. Simple but can block: potential deadlock, reduced throughput. Modern systems: prefer MVCC but locking still essential." -- Concurrency control

Lock Types

Shared Lock (Read Lock)

Multiple transactions can hold simultaneously. Reading: non-conflicting. Prevents: dirty reads. Blocks: exclusive locks. Released: after read complete.

Exclusive Lock (Write Lock)

Only one transaction: holds exclusively. Writing: sole access. Blocks: all other locks (shared, exclusive). Released: after write complete (typically at commit).

Lock Conflict

Shared + Shared: compatible (both proceed). Shared + Exclusive: conflict (one waits). Exclusive + Exclusive: conflict (one waits). Resolution: queuing (first come, first served).

Wait Behavior

Request conflicting lock: transaction waits (blocks). Blocked: until lock released. Waiting queue: transactions queued fairly. Starvation prevention: careful ordering.

Lock Scope

Row-level: fine-grained (more concurrency). Table-level: coarse-grained (simpler). Mixed: adaptive (depends on selectivity). Choice: trade-off between granularity and overhead.

Shared Locks

Semantics

Multiple readers: each holds shared lock. Writers: blocked (cannot acquire exclusive). Allows: high concurrency for reads. Cost: none (shared compatible).

Example

Transaction A (Reader): SELECT * FROM accounts WHERE id=1
 - Acquires shared lock on row

Transaction B (Reader): SELECT * FROM accounts WHERE id=1
 - Also acquires shared lock (compatible)
 - Both read simultaneously

Transaction C (Writer): UPDATE accounts SET balance=100 WHERE id=1
 - Waits for shared locks to release (blocked)

Read-Heavy Workloads

Multiple readers: shared locks excellent. Low contention: minimal blocking. Write: eventually gets lock (after readers done). Throughput: high for read-heavy patterns.

Nested Reads

Multiple levels (row -> page -> table): all shared. Accumulation: stack of locks. Release: bottom-up (careful ordering). Complexity: transaction must manage.

Promotion Issue

Transaction: starts with shared, wants to upgrade to exclusive. Conflicts: other readers holding shared. Deadlock risk: all waiters (upgrade conflict). Solution: careful protocol.

Exclusive Locks

Semantics

Single writer: sole access. All other transactions: blocked. Prevents: lost updates, dirty reads. Cost: blocking (reduced concurrency). Necessary: safety (correctness first).

Example

Transaction A (Writer): UPDATE accounts SET balance=100 WHERE id=1
 - Acquires exclusive lock on row
 - All other transactions blocked

Transaction B (Reader): SELECT * FROM accounts WHERE id=1
 - Waits (blocked by exclusive lock)

Transaction C (Writer): UPDATE accounts SET balance=200 WHERE id=1
 - Waits (blocked by exclusive lock)

(All wait until A commits, then B and C proceed)

Write-Heavy Workloads

Many writers: heavy contention. Blocking: significant (reduced throughput). Queue formation: writers wait. Bottleneck: serialization (loses parallelism).

Duration

Exclusive lock: held until commit (durability requirement). Cannot release early: dirty read risk. Duration: critical (impacts other transactions). Optimize: keep transactions short.

Hold Time

Long-running transaction: holds locks long. Other transactions: blocked (waiting). Throughput: reduced. Practice: minimize transaction duration (do work outside transaction).

Lock Compatibility Matrix

Held Lock Shared Requested Exclusive Requested
No Lock Granted Granted
Shared Lock Granted Blocked (wait)
Exclusive Lock Blocked (wait) Blocked (wait)

Interpretation

Multiple shared: always compatible (readers don't conflict). Exclusive: blocks all (sole access). Reading after writing: blocked until writer releases. Writing after reading: blocked until readers done.

Fairness

Queue: FIFO (fair). Exclusive requests: sometimes prioritized (prevent starvation). Systems vary: check implementation.

Lock Granularity

Granularity Levels

Database level: all tables (coarse). Table level: entire table. Page level: block of rows. Row level: single row (fine). Column level: rare (expensive).

Trade-offs

Coarse: fewer locks, less overhead, more blocking. Fine: more locks, higher overhead, better concurrency. Choice: workload-dependent. Sweet spot: row-level (balance).

Escalation

Start: row-level (many locks). Escalate: too many locks, promote to table (fewer locks). Automatic: system escalates. Risk: escalation reduces concurrency (more blocking).

Hierarchical Locking

Database > Table > Page > Row. Intention locks: announce intent at higher level (reading row needs intent-read on table). Prevents: conflicting higher-level locks.

Intent Lock Modes

Intent-Shared (IS): plan to read. Intent-Exclusive (IX): plan to write. Shared-Intent-Exclusive (SIX): read at this level, write below. Complex but necessary: prevent conflicts at wrong level.

Practical Implementation

Row-level common: good balance. Table-level: simple (less granular). Escalation: automatic (system decides). Monitoring: detect excessive locks.

Lock Modes and Variants

Pessimistic vs. Optimistic

Pessimistic: acquire lock before access (assume conflicts). Optimistic: detect conflicts at commit (assume no conflicts). Pessimistic: simpler, blocking. Optimistic: better concurrency, complex.

Implicit vs. Explicit

Implicit: system acquires automatically (transparent). Explicit: application requests (control). Implicit: easier. Explicit: fine-tuning possible.

Predicate Locks

Lock condition: WHERE age > 30. Prevents: phantom reads (new rows matching inserted). Complex: expensive. Rare: few systems implement.

Gap Locks

Lock gap between rows: prevent insertion. Prevents: phantom reads (selective). Range locks: equivalent. Modern: common in MVCC systems (less needed).

Next-Key Locks

Lock record + gap after: prevents phantoms within range. MySQL InnoDB: uses this. Combines: record lock + gap lock.

Deadlock Concepts

Definition

Circular wait: transactions waiting for each other. Example: A waits for B, B waits for A. Unresolvable: both blocked forever (unless detected).

Deadlock Scenario

Transaction A: Lock row 1 (exclusive), waits for row 2
Transaction B: Lock row 2 (exclusive), waits for row 1

Circular dependency: neither can proceed
Deadlock: system must detect and resolve

Detection

Waits-for graph: transactions are nodes, edges=waits-for. Cycle: deadlock detected. Periodically checked: typically every few seconds. Cost: cycle detection computation.

Resolution

Victim selection: one transaction aborted (rolled back). Victim released: locks released, other continues. Victim retry: application retries (transparent usually). Fair: avoid repeated victim selection.

Timeout Approach

Transaction times out: after waiting duration. Automatically aborted (rolled back). Simple: no cycle detection. Risk: timeout too short (false aborts), too long (unresponsive).

Prevention

Lock ordering: acquire locks in consistent order. No cycles: prevention. Cost: application discipline. Practical: difficult to enforce globally.

Lock Protocols

Two-Phase Locking (2PL)

Phase 1 (Growing): acquire locks, no releases. Phase 2 (Shrinking): release locks, no acquires. Ensures: serializability (correctness). Lock point: maximum locks held (dividing point).

Strict 2PL

Exclusive locks: held until commit (not released earlier). Prevents: dirty reads, cascading rollbacks. Standard: SQL databases (usually strict).

Conservative 2PL

Acquire all locks upfront (predeclare). No blocking after: proceeds or aborts. Prevents: deadlock (no cycles). Cost: over-locking (more conflicts), predeclare overhead.

Advantages

Serializability: guaranteed. Isolation: complete. Simplicity: straightforward protocol.

Disadvantages

Blocking: potential (reduces concurrency). Deadlock: possible (requires resolution). Cascading rollback: possible (without strict variant).

Practical Implementation

SQL databases: default to strict 2PL (usually automatic, transparent). Lock acquisition: implicit with data access. Release: at transaction boundary (commit/rollback).

Performance Impact

Lock Overhead

Lock acquisition: CPU cost (hash table lookup, queue management). Lock held: memory cost (data structure). Contention: blocking (wait cost). Total: depends on workload.

Throughput vs. Latency

Locking serializes: reduces throughput (fewer parallel). Latency: may increase (wait times). Trade: serialization safety for performance. Optimization: minimize lock hold time.

Contention Scenarios

Hot rows: frequently accessed. High lock contention: many transactions waiting. Bottleneck: serialization (loss of parallelism). Solution: denormalize, cache, or MVCC.

Optimization Strategies

Shorter transactions: minimize lock hold time. Fewer locks: access less data. Lower granularity: prevent unnecessary blocking. Indexes: reduce rows scanned.

Monitoring

Lock wait time: measure delays. Lock holder duration: identify slow transactions. Deadlock frequency: indicate protocol issues. Tuning: based on metrics.

Optimizations and Alternatives

Multi-Version Concurrency Control (MVCC)

Snapshot: reader sees version at start time. Writer: creates new version. No lock: high concurrency. Cost: version storage, cleanup (garbage collection).

Lock-Free Structures

Compare-and-swap: atomic operations. No locks: true parallelism. Cost: complexity (CAS retry loops). Limited use: simple data structures.

Partitioning

Divide data: separate locks per partition. Reduce contention: spreads load. Trade: query complexity (multi-partition queries). Effective: hot-spot mitigation.

Caching

Reduce database access: cache in memory. Fewer locks: fewer database queries. Simplification: lower concurrency pressure. Consistency: cache coherence (complex).

Read Replicas

Read from replicas: distribute load. Write: single source (consistent). Latency: eventual (replication lag). Benefit: read scalability.

When to Use Alternatives

Locking: sufficient for moderate contention. MVCC: high read/write mix. Lock-free: specialized (advanced). Partitioning: hot-spot mitigation. Caching: reduce load. Choose: based on analysis.

Practical Considerations

Application Design

Keep transactions short: minimize lock hold time. Access small data: reduce lock scope. Predictable order: prevent deadlock. Error handling: retry on deadlock.

SQL Examples

SELECT * FROM accounts WHERE id=1 FOR UPDATE;
(Explicit exclusive lock acquisition, MySQL/PostgreSQL)

BEGIN TRANSACTION;
 UPDATE accounts SET balance=balance-100 WHERE id=1;
 UPDATE accounts SET balance=balance+100 WHERE id=2;
COMMIT;

Deadlock Handling

Application: catch deadlock exception. Retry: automatically (with backoff). Logging: track frequency. Investigation: identify pattern (lock ordering issue?).

Monitoring and Tools

DBMS tools: lock status, waits-for graph. Metrics: lock wait time, contention. Profiling: identify lock hotspots. Tuning: address bottlenecks.

Testing

Concurrency tests: multiple transactions. Deadlock injection: simulate failures. Load testing: measure performance. Correctness: verify serializability.

References

  • Ramakrishnan, R., and Gehrke, J. "Database Management Systems." McGraw-Hill, 3rd edition, 2003.
  • Garcia-Molina, H., Ullman, J. D., and Widom, J. "Database Systems: The Complete Book." Pearson, 2nd edition, 2008.
  • Silberschatz, A., Korth, H. F., and Sudarshan, S. "Database System Concepts." McGraw-Hill, 6th edition, 2010.
  • Bernstein, P. A., Hadzilacos, V., and Goodman, N. "Concurrency Control and Recovery in Database Systems." Addison-Wesley, 1987.
  • Kleppmann, M. "Designing Data-Intensive Applications." O'Reilly Media, 2017.