Introduction

Two-Phase Locking (2PL): protocol ensuring serializability. Divides transaction into two phases: growing (acquire locks), shrinking (release locks). Simple protocol, powerful guarantee. Foundation: database concurrency control.

Theorem: if all transactions follow 2PL, execution serializable (transaction appear sequential). Proof: lock-based prevents cycles in dependency graph. Standard: SQL databases implement (usually transparently).

Variants: Strict 2PL (standard), Rigorous 2PL (stronger), Conservative 2PL (prevented deadlock). Trade-offs: simplicity, deadlock risk, performance.

"Two-Phase Locking guarantees serializability through simple discipline: acquire before access, release after complete. Foundation: decades of proven correctness, trusted by production systems worldwide." -- Concurrency control

Basic Concept

Definition

Two phases: Growing (Phase 1) and Shrinking (Phase 2). Growing: acquire locks, no releases. Shrinking: release locks, no acquires. Lock point: transition (maximum locks held).

Example Timeline

Transaction: BEGIN
 Acquire lock on row 1 (growing phase)
 Acquire lock on row 2 (growing phase)
 Access rows 1, 2
 Lock point (maximum locks here)
 Release lock on row 1 (shrinking phase)
 Release lock on row 2 (shrinking phase)
COMMIT

Enforcement

Automatic: system enforces (usually transparent). Implicit: acquires at access, releases at phase boundary. Explicit: application can request (rare).

Property

Necessary condition: serializability. Sufficient: also guaranteed (if acyclic). Proof: conflict serializable => 2PL schedule possible.

Key Insight

Locks prevent conflicts: cycle prevention. No cycles: serializable. 2PL structure: prevents cycles (mathematical guarantee).

Growing Phase

Activities

Acquire locks: before data access (read or write). Multiple locks possible: different rows. Nested: hierarchical locks (intention locks). No releases: strict growing.

Lock Acquisition

Transaction needs to read row X:
 Acquire shared lock on X (if not already held)
 Proceed with read
 Lock retained for later release

Transaction needs to write row Y:
 Acquire exclusive lock on Y (if not already held)
 Proceed with write
 Lock retained for later release

Nested Transactions

Subtransactions: acquire locks (contributing to parent). Parent: responsible (releases all child locks at end). Complexity: lock management.

Waiting

Lock unavailable: transaction waits (blocked). Queue: others waiting for same lock. Fair: FIFO typically. Eventually granted: when lock released.

Lock Upgrade

Shared -> exclusive: upgrade. Risky: deadlock (other readers waiting). Some systems: prevent (fail upgrade). Workaround: release shared, request exclusive (risky window).

Duration

Locks held: until shrinking phase. Cost: blocking other transactions. Long growing: reduces concurrency. Optimize: shorter growing phase.

Shrinking Phase

Activities

Release locks: after access complete (no more reads/writes). Sequential: order doesn't matter (for correctness). All locks released: by commit/rollback.

Lock Release

Done accessing data: release locks
 Release lock on row 1
 Release lock on row 2
 (order irrelevant for correctness)

After all locks released: other transactions proceed
 Waiting transactions granted (if locks now available)

Cleanup

Rollback: release locks (revert changes). Commit: release locks (persist changes). Both: return to initial lock-free state.

Deadlock Resolution

Victim transaction: aborted (releases all locks). Awakens waiters: can now proceed. Victim retried: application (usually automatic).

Lock Cascade

Release lock on row X: waiting transaction (Z) granted. Z may release locks: cascades. Wave: propagates through queue. Efficiency: system dependent.

Timestamp

Lock point: dividing boundary. Before: growing (acquiring). After: shrinking (releasing). Critical: all operations before lock point (view consistent at point).

Lock Point

Definition

Moment: transaction transitions from growing to shrinking. Before: can acquire locks. After: can only release. Critical: all data accessed must have locks acquired by this point.

Significance

Snapshot: lock point represents consistent view. All reads: reflect state at lock point. Updates: consistent relative to this moment. Serializability: based on lock points (order).

Example

Transaction A:
 Acquire lock on account 1
 Acquire lock on account 2
 [LOCK POINT]
 Read account 1: $100
 Read account 2: $50
 (snapshot at lock point: consistent)

Transaction B (overlapping):
 Can update: only if no lock acquired by A
 If A acquired lock: B waits

Early Release

Release before commit: risky (dirty read possible if not careful). Standard: hold until commit (strict 2PL). Early: requires careful handling.

Ordering

Lock point order: determines serializability. Earlier point: seen as executing first. Conflict serializable: order derivable from lock points.

Strict Two-Phase Locking

Definition

Exclusive locks held until commit/rollback (not released early). Shared locks: may be released earlier (read-only, safe). Prevents: dirty reads, cascading rollback.

Improvement over Basic 2PL

Basic: might release exclusive lock early. Risk: other transaction reads released data (dirty read if rollback). Strict: prevents (holds until end).

Implementation

Exclusive lock: SET lock_until_end=true
Shared lock: SET lock_until_end=false (can release early)

On COMMIT/ROLLBACK: release all exclusive locks

Deadlock Guarantee

Strict 2PL: reduces deadlock cycles (less overlap). Cascading prevented: transaction isolation stronger. Cost: lock held longer.

Standard Practice

SQL databases: typically implement strict 2PL. Transparent: automatic. Safest: default assumption. Trade: acceptable performance cost.

Cascading Avoidance

Cascading rollback: uncommitted data read. If reader rolls back: cascade. Strict 2PL: prevents (no uncommitted reads). Stronger: deserves the cost.

Rigorous Two-Phase Locking

Definition

All locks (shared and exclusive): held until commit/rollback. Strictest variant. All phantom reads prevented (range locks held). Highest safety.

Difference from Strict

Strict: exclusive until end, shared can release. Rigorous: all until end (more locks held, less concurrency). Trade: absolute safety vs. performance.

Use Cases

Critical transactions: serializability essential. Complex multi-step: safety paramount. High-value operations: bank transfers, medical records. Accept: performance cost.

Implementation Cost

Many locks: higher overhead. Longer hold: reduced concurrency. Benefit: absolute guarantee (no anomalies). Worth: depends on requirements.

Practical Adoption

Rare: explicit (usually implicit in highest isolation). Systems: offer but not default. Performance: prohibitive for high-throughput.

Serializability Guarantee

Theorem

If all transactions follow 2PL, execution serializable. Proof: lock-based prevents conflict cycles. Graph acyclic: serialization order exists.

Conflict Serializable

Conflicts: read-write, write-write between transactions. Precedence graph: nodes=transactions, edges=conflicts. Cycle: non-serializable. 2PL: prevents cycles (graph acyclic).

Example Serialization

Transaction A, B, C: concurrent execution
2PL enforces: lock points order (A before B before C, say)
Serialization: equivalent to sequential A->B->C
Result: same as running A, then B, then C sequentially

Isolation Achievement

Serializability: strongest isolation definition. ACID: consistency guaranteed (serializable). 2PL: achieves (automatically).

Edge Cases

Phantom reads: prevented if range locks used (rigorous). Predicate locks: complex, rarely used. Standard: row-level locks (good enough).

Practical Guarantee

Developers: can assume serializability (with 2PL). Complex transactions: safe (isolation guaranteed). Correctness: ensured by protocol.

Deadlock Handling with 2PL

Deadlock Risk

Circular wait: possible even with 2PL. Example: A locks X, waits for Y; B locks Y, waits for X. Cycle: both blocked. Resolution: required.

Detection Mechanism

Waits-for graph: periodically checked. Cycle detected: deadlock confirmed. Victim selection: which to abort. Fairness: avoid repeated victim.

Resolution Strategies

Victim selection: youngest (lose less work), minimal locks held (less cleanup), random (simple). Abort: release all locks, rollback. Victim retries: transparently (usually).

Prevention Techniques

Lock ordering: consistent order prevents cycles. A always locks X before Y. B same order. No cycles possible (acyclic). Cost: application discipline.

Timeout Alternative

Transaction timeout: after waiting duration, aborted. Simple: no cycle detection. Risk: false aborts (timeout too short). Pragmatic: used in practice.

Example Avoidance

Lock ordering discipline:
Transaction A: acquire A1, then A2
Transaction B: acquire B1, then B2
Ensure: A1 < B1 and A2 < B2 (consistent ordering)
Result: no cycles, deadlock-free

Conservative Two-Phase Locking

Concept

Acquire all locks upfront (predeclare). If any unavailable: abort immediately (not block). Serialization: lock point at start (all locks acquired). Prevents: deadlock (no waiting).

Implementation

Transaction start: declare all locks needed
 Acquire lock on row 1: success
 Acquire lock on row 2: success
 Acquire lock on row 3: conflict, abort

Result: restart (don't wait)

Advantages

Deadlock-free: no waiting (no cycles). Lock point: at start (well-defined). Simple: straightforward logic.

Disadvantages

Over-locking: predeclare all, some unused. Aborts: likely if contention. Restarts: expensive. Practical: overhead high for typical applications.

Use Cases

Batch processing: know data in advance. Simple transactions: few rows. Low contention: predeclare acceptable. Rare: overhead usually prohibitive.

Comparison to Standard

Standard 2PL: acquire as needed (less over-locking). Conservative: upfront (deadlock-free). Trade: flexibility vs. guarantee.

Limitations and Trade-offs

Deadlock Possibility

2PL: permits deadlock (unlike conservative). Detection + resolution: required. Cost: overhead, victim loss (aborted work).

Reduced Concurrency

Locks block: limited parallelism. Contention: severe with many transactions. Throughput: reduced from maximum possible. Bottleneck: lock contention.

Long Transactions

Hold locks long: block others. Impact: cumulative (many transactions waiting). Optimization: keep transactions short (critical).

Cascading Rollback Risk

Basic 2PL: possible (strict variants prevent). Rollback: triggers others. Cascade: can be large. Cost: rework (expensive).

Performance Overhead

Lock management: CPU cost. Lock contention: wait time. Total: significant impact. Tuning: essential for high throughput.

When Alternatives Better

MVCC: high read/write mix, high concurrency. Optimistic locking: low contention. Lock-free: specialized data structures. Choose: based on workload.

2PL in Modern Systems

Explicit Use

SQL Server, Oracle: 2PL often used explicitly. SELECT ... FOR UPDATE: request locks. Transaction scope: automatic lock management.

Implicit Implementation

Transparent: system enforces (usually). Developer: unaware of 2PL (works automatically). Assumption: safe (isolation guaranteed).

MVCC Integration

Modern systems: MVCC + 2PL hybrid. Readers: MVCC (no locks, snapshots). Writers: 2PL (locks, serial). Benefit: read parallelism + write safety.

Performance Tuning

Monitor lock waits: identify contention. Optimize queries: reduce lock hold time. Indexing: reduce rows scanned. Transactions: keep short.

Practical Application

Developers: assume 2PL (correct). OLTP: default workload. OLAP: often disabled (snapshot isolation). Verify: system default (varies).

References

  • Eswaran, K. P., et al. "The Notions of Consistency and Predicate Locks in a Database System." Communications of the ACM, vol. 19, no. 11, 1976, pp. 624-633.
  • Ramakrishnan, R., and Gehrke, J. "Database Management Systems." McGraw-Hill, 3rd edition, 2003.
  • Garcia-Molina, H., Ullman, J. D., and Widom, J. "Database Systems: The Complete Book." Pearson, 2nd edition, 2008.
  • Silberschatz, A., Korth, H. F., and Sudarshan, S. "Database System Concepts." McGraw-Hill, 6th edition, 2010.
  • Bernstein, P. A., Hadzilacos, V., and Goodman, N. "Concurrency Control and Recovery in Database Systems." Addison-Wesley, 1987.