Introduction
Locking: mechanism ensuring concurrent access safety. Transactions acquire locks before access. Locks prevent: dirty reads, lost updates, race conditions. Cost: blocking (reduced concurrency). Balance: correctness vs. performance.
Core idea: mutual exclusion. Reader lock: multiple allowed. Writer lock: exclusive. Conflicts: prevent simultaneous access. Deadlock: potential risk (circular waiting). Management: system responsibility.
Traditional approach: databases use locking. Alternative: MVCC (multi-version concurrency control, modern). Choice: trade-off (simplicity vs. performance).
"Locking ensures isolation: prevents interference between concurrent transactions. Simple but can block: potential deadlock, reduced throughput. Modern systems: prefer MVCC but locking still essential." -- Concurrency control
Lock Types
Shared Lock (Read Lock)
Multiple transactions can hold simultaneously. Reading: non-conflicting. Prevents: dirty reads. Blocks: exclusive locks. Released: after read complete.
Exclusive Lock (Write Lock)
Only one transaction: holds exclusively. Writing: sole access. Blocks: all other locks (shared, exclusive). Released: after write complete (typically at commit).
Lock Conflict
Shared + Shared: compatible (both proceed). Shared + Exclusive: conflict (one waits). Exclusive + Exclusive: conflict (one waits). Resolution: queuing (first come, first served).
Wait Behavior
Request conflicting lock: transaction waits (blocks). Blocked: until lock released. Waiting queue: transactions queued fairly. Starvation prevention: careful ordering.
Lock Scope
Row-level: fine-grained (more concurrency). Table-level: coarse-grained (simpler). Mixed: adaptive (depends on selectivity). Choice: trade-off between granularity and overhead.
Exclusive Locks
Semantics
Single writer: sole access. All other transactions: blocked. Prevents: lost updates, dirty reads. Cost: blocking (reduced concurrency). Necessary: safety (correctness first).
Example
Transaction A (Writer): UPDATE accounts SET balance=100 WHERE id=1
- Acquires exclusive lock on row
- All other transactions blocked
Transaction B (Reader): SELECT * FROM accounts WHERE id=1
- Waits (blocked by exclusive lock)
Transaction C (Writer): UPDATE accounts SET balance=200 WHERE id=1
- Waits (blocked by exclusive lock)
(All wait until A commits, then B and C proceed)
Write-Heavy Workloads
Many writers: heavy contention. Blocking: significant (reduced throughput). Queue formation: writers wait. Bottleneck: serialization (loses parallelism).
Duration
Exclusive lock: held until commit (durability requirement). Cannot release early: dirty read risk. Duration: critical (impacts other transactions). Optimize: keep transactions short.
Hold Time
Long-running transaction: holds locks long. Other transactions: blocked (waiting). Throughput: reduced. Practice: minimize transaction duration (do work outside transaction).
Lock Compatibility Matrix
| Held Lock | Shared Requested | Exclusive Requested |
|---|---|---|
| No Lock | Granted | Granted |
| Shared Lock | Granted | Blocked (wait) |
| Exclusive Lock | Blocked (wait) | Blocked (wait) |
Interpretation
Multiple shared: always compatible (readers don't conflict). Exclusive: blocks all (sole access). Reading after writing: blocked until writer releases. Writing after reading: blocked until readers done.
Fairness
Queue: FIFO (fair). Exclusive requests: sometimes prioritized (prevent starvation). Systems vary: check implementation.
Lock Granularity
Granularity Levels
Database level: all tables (coarse). Table level: entire table. Page level: block of rows. Row level: single row (fine). Column level: rare (expensive).
Trade-offs
Coarse: fewer locks, less overhead, more blocking. Fine: more locks, higher overhead, better concurrency. Choice: workload-dependent. Sweet spot: row-level (balance).
Escalation
Start: row-level (many locks). Escalate: too many locks, promote to table (fewer locks). Automatic: system escalates. Risk: escalation reduces concurrency (more blocking).
Hierarchical Locking
Database > Table > Page > Row. Intention locks: announce intent at higher level (reading row needs intent-read on table). Prevents: conflicting higher-level locks.
Intent Lock Modes
Intent-Shared (IS): plan to read. Intent-Exclusive (IX): plan to write. Shared-Intent-Exclusive (SIX): read at this level, write below. Complex but necessary: prevent conflicts at wrong level.
Practical Implementation
Row-level common: good balance. Table-level: simple (less granular). Escalation: automatic (system decides). Monitoring: detect excessive locks.
Lock Modes and Variants
Pessimistic vs. Optimistic
Pessimistic: acquire lock before access (assume conflicts). Optimistic: detect conflicts at commit (assume no conflicts). Pessimistic: simpler, blocking. Optimistic: better concurrency, complex.
Implicit vs. Explicit
Implicit: system acquires automatically (transparent). Explicit: application requests (control). Implicit: easier. Explicit: fine-tuning possible.
Predicate Locks
Lock condition: WHERE age > 30. Prevents: phantom reads (new rows matching inserted). Complex: expensive. Rare: few systems implement.
Gap Locks
Lock gap between rows: prevent insertion. Prevents: phantom reads (selective). Range locks: equivalent. Modern: common in MVCC systems (less needed).
Next-Key Locks
Lock record + gap after: prevents phantoms within range. MySQL InnoDB: uses this. Combines: record lock + gap lock.
Deadlock Concepts
Definition
Circular wait: transactions waiting for each other. Example: A waits for B, B waits for A. Unresolvable: both blocked forever (unless detected).
Deadlock Scenario
Transaction A: Lock row 1 (exclusive), waits for row 2
Transaction B: Lock row 2 (exclusive), waits for row 1
Circular dependency: neither can proceed
Deadlock: system must detect and resolve
Detection
Waits-for graph: transactions are nodes, edges=waits-for. Cycle: deadlock detected. Periodically checked: typically every few seconds. Cost: cycle detection computation.
Resolution
Victim selection: one transaction aborted (rolled back). Victim released: locks released, other continues. Victim retry: application retries (transparent usually). Fair: avoid repeated victim selection.
Timeout Approach
Transaction times out: after waiting duration. Automatically aborted (rolled back). Simple: no cycle detection. Risk: timeout too short (false aborts), too long (unresponsive).
Prevention
Lock ordering: acquire locks in consistent order. No cycles: prevention. Cost: application discipline. Practical: difficult to enforce globally.
Lock Protocols
Two-Phase Locking (2PL)
Phase 1 (Growing): acquire locks, no releases. Phase 2 (Shrinking): release locks, no acquires. Ensures: serializability (correctness). Lock point: maximum locks held (dividing point).
Strict 2PL
Exclusive locks: held until commit (not released earlier). Prevents: dirty reads, cascading rollbacks. Standard: SQL databases (usually strict).
Conservative 2PL
Acquire all locks upfront (predeclare). No blocking after: proceeds or aborts. Prevents: deadlock (no cycles). Cost: over-locking (more conflicts), predeclare overhead.
Advantages
Serializability: guaranteed. Isolation: complete. Simplicity: straightforward protocol.
Disadvantages
Blocking: potential (reduces concurrency). Deadlock: possible (requires resolution). Cascading rollback: possible (without strict variant).
Practical Implementation
SQL databases: default to strict 2PL (usually automatic, transparent). Lock acquisition: implicit with data access. Release: at transaction boundary (commit/rollback).
Performance Impact
Lock Overhead
Lock acquisition: CPU cost (hash table lookup, queue management). Lock held: memory cost (data structure). Contention: blocking (wait cost). Total: depends on workload.
Throughput vs. Latency
Locking serializes: reduces throughput (fewer parallel). Latency: may increase (wait times). Trade: serialization safety for performance. Optimization: minimize lock hold time.
Contention Scenarios
Hot rows: frequently accessed. High lock contention: many transactions waiting. Bottleneck: serialization (loss of parallelism). Solution: denormalize, cache, or MVCC.
Optimization Strategies
Shorter transactions: minimize lock hold time. Fewer locks: access less data. Lower granularity: prevent unnecessary blocking. Indexes: reduce rows scanned.
Monitoring
Lock wait time: measure delays. Lock holder duration: identify slow transactions. Deadlock frequency: indicate protocol issues. Tuning: based on metrics.
Optimizations and Alternatives
Multi-Version Concurrency Control (MVCC)
Snapshot: reader sees version at start time. Writer: creates new version. No lock: high concurrency. Cost: version storage, cleanup (garbage collection).
Lock-Free Structures
Compare-and-swap: atomic operations. No locks: true parallelism. Cost: complexity (CAS retry loops). Limited use: simple data structures.
Partitioning
Divide data: separate locks per partition. Reduce contention: spreads load. Trade: query complexity (multi-partition queries). Effective: hot-spot mitigation.
Caching
Reduce database access: cache in memory. Fewer locks: fewer database queries. Simplification: lower concurrency pressure. Consistency: cache coherence (complex).
Read Replicas
Read from replicas: distribute load. Write: single source (consistent). Latency: eventual (replication lag). Benefit: read scalability.
When to Use Alternatives
Locking: sufficient for moderate contention. MVCC: high read/write mix. Lock-free: specialized (advanced). Partitioning: hot-spot mitigation. Caching: reduce load. Choose: based on analysis.
Practical Considerations
Application Design
Keep transactions short: minimize lock hold time. Access small data: reduce lock scope. Predictable order: prevent deadlock. Error handling: retry on deadlock.
SQL Examples
SELECT * FROM accounts WHERE id=1 FOR UPDATE;
(Explicit exclusive lock acquisition, MySQL/PostgreSQL)
BEGIN TRANSACTION;
UPDATE accounts SET balance=balance-100 WHERE id=1;
UPDATE accounts SET balance=balance+100 WHERE id=2;
COMMIT;
Deadlock Handling
Application: catch deadlock exception. Retry: automatically (with backoff). Logging: track frequency. Investigation: identify pattern (lock ordering issue?).
Monitoring and Tools
DBMS tools: lock status, waits-for graph. Metrics: lock wait time, contention. Profiling: identify lock hotspots. Tuning: address bottlenecks.
Testing
Concurrency tests: multiple transactions. Deadlock injection: simulate failures. Load testing: measure performance. Correctness: verify serializability.
References
- Ramakrishnan, R., and Gehrke, J. "Database Management Systems." McGraw-Hill, 3rd edition, 2003.
- Garcia-Molina, H., Ullman, J. D., and Widom, J. "Database Systems: The Complete Book." Pearson, 2nd edition, 2008.
- Silberschatz, A., Korth, H. F., and Sudarshan, S. "Database System Concepts." McGraw-Hill, 6th edition, 2010.
- Bernstein, P. A., Hadzilacos, V., and Goodman, N. "Concurrency Control and Recovery in Database Systems." Addison-Wesley, 1987.
- Kleppmann, M. "Designing Data-Intensive Applications." O'Reilly Media, 2017.