Introduction

Isolation level: degree of isolation between concurrent transactions. Spectrum: weak (dirty reads possible) to strong (complete isolation). Trade-off: consistency vs. concurrency. Default varies (PostgreSQL: read committed, MySQL: repeatable read).

Standard: SQL defines four levels. Implementation varies (MVCC, locking). Choice: application requirements. Higher level: safer but slower. Lower level: faster but risky.

Common mistake: assuming ACID means full isolation by default (most databases default to a weaker level). Verify: database default. Critical data: explicitly set level. Non-critical: relax for performance.

"Isolation levels balance correctness and performance. Strong guarantees slow systems, weak enable speed but risk anomalies. Understanding: essential for correct database design." -- Concurrency control

Transaction Anomalies

Dirty Read

Read uncommitted data: later rolled back. Transaction sees a value that was never committed. Consistency violation: application assumes value persists (false). Rarely acceptable: possible only at the lowest isolation level (read uncommitted).

Non-Repeatable Read

Same read twice: different results (updated by other transaction). Inconsistent view: value changed during transaction. Every value read was committed: no invalid data, but the two reads disagree. Acceptable: many applications.

Phantom Read

Range query twice: different result set (rows inserted/deleted by other). Ghost rows: appear/disappear. Query intent violated: range changed. Problematic: aggregate queries, pagination.

Lost Update

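Two transactions read the same value, then both write (read-modify-write): second write overwrites first. Both transactions: believe success. But: one update lost. Correctness violation: data silently wrong. Not among the three anomalies the SQL standard uses to define levels. Prevention: varies by database and level. Read committed: generally does not prevent it. PostgreSQL repeatable read: detects the conflict and aborts one transaction. MySQL InnoDB repeatable read: does not, unless locking reads (SELECT ... FOR UPDATE) or atomic updates (SET balance = balance + 50) are used. Basic guarantee at all levels: no dirty writes (overwriting another transaction's uncommitted data).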
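
The interleaving behind a lost update can be sketched in a few lines of Python. This is a toy model only: a dict stands in for one table row, and the names account, read_balance, write_balance, and atomic_deposit are hypothetical, not part of any database API. It shows the read-modify-write race and one common remedy, an atomic increment.

```python
# Toy model: a dict stands in for one table row; no real database is involved.
account = {"balance": 100}

def read_balance():
    return account["balance"]

def write_balance(value):
    account["balance"] = value

# Both transactions read before either writes (the problematic interleaving).
seen_by_a = read_balance()          # Transaction A reads 100
seen_by_b = read_balance()          # Transaction B reads 100
write_balance(seen_by_a + 50)       # A deposits 50, balance becomes 150
write_balance(seen_by_b + 30)       # B deposits 30 from its stale read, balance becomes 130

lost_update_result = account["balance"]   # A's deposit has been overwritten

# Remedy sketch: do the read and the write as one atomic step, the analogue of
# UPDATE accounts SET balance = balance + :amount WHERE id = 1
def atomic_deposit(amount):
    account["balance"] += amount

account["balance"] = 100
atomic_deposit(50)
atomic_deposit(30)
atomic_result = account["balance"]        # both deposits are kept
```

In a real database the same effect is usually obtained with an atomic UPDATE, a locking read, or a stronger isolation level; which of these applies depends on the system.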

Anomaly Hierarchy

Severity (most to least): dirty read > non-repeatable read > phantom read. Each stronger level rules out one more anomaly. Serializable: none permitted (safest). Weaker levels: some permitted (faster). Application decides: tolerance level.

Read Uncommitted

Definition

Minimal isolation: transactions may see uncommitted writes. Read any data: committed or not. Fastest: no read locks needed. Unsafe: anomalies common.

Anomalies Possible

Dirty reads: yes. Non-repeatable reads: yes. Phantom reads: yes. Lost updates: yes (only dirty writes are prevented). Only use: non-critical data where speed is essential.

Implementation

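No read locks: uncommitted writes visible immediately. Writers: still take write locks (prevents dirty writes). Readers: don't block writers. Writers: don't block readers. Fastest possible: trades safety. Note: PostgreSQL accepts this level but treats it as read committed.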

Use Cases

Real-time statistics: approximate acceptable. Cache: stale data tolerated. Reporting on production: data not critical. Rare: risky for most applications.

SQL Syntax

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
-- PostgreSQL accepts this statement but behaves as READ COMMITTED

Read Committed

Definition

Read only committed data. No dirty reads. Non-repeatable reads possible (value updated). Phantom reads possible (rows added/deleted). Good balance: safety and performance.

Implementation

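Lock-based: shared locks held during read (released after), exclusive locks held until commit. Prevents: dirty reads (exclusive locks block readers). Allows: non-repeatable reads (shared lock released early). MVCC alternative (PostgreSQL, Oracle): each statement reads a fresh snapshot of committed data, no read locks.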

Anomalies Possible

Dirty reads: no. Non-repeatable reads: yes (likely). Phantom reads: yes. Lost updates: yes (read-modify-write cycles without locking reads or atomic updates).

Example Scenario

Transaction A: BEGIN
Transaction A: SELECT balance FROM accounts WHERE id=1
 (gets committed value: $100)

Transaction B: BEGIN
Transaction B: UPDATE accounts SET balance=150 WHERE id=1
Transaction B: COMMIT

Transaction A: SELECT balance FROM accounts WHERE id=1
 (gets new value: $150, non-repeatable read)

Popularity

Default: PostgreSQL, Oracle, SQL Server. Good for: high concurrency, mostly safe. Trade-off: acceptable anomalies for performance.

SQL Syntax

SET TRANSACTION ISOLATION LEVEL READ COMMITTED;

Repeatable Read

Definition

Same row: consistent value throughout transaction. No non-repeatable reads. Phantom reads possible (new rows appear). Intermediate: more safety than read committed.

Implementation

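Lock-based: shared locks held until commit (not released early). Ensures: same row read twice = same value. Range locks: not taken, so new rows matching a predicate can still appear (phantoms possible). MVCC alternative: one snapshot for the whole transaction.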

Anomalies Possible

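Dirty reads: no. Non-repeatable reads: no. Phantom reads: yes (per the standard; some implementations also prevent them). Lost updates: implementation-dependent (lock-based and PostgreSQL: prevented; MySQL InnoDB: possible with plain reads).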

Example Scenario

Transaction A: BEGIN
Transaction A: SELECT COUNT(*) FROM accounts WHERE status='active'
 (result: 10 rows)

Transaction B: INSERT INTO accounts VALUES (..., 'active')
Transaction B: COMMIT

Transaction A: SELECT COUNT(*) FROM accounts WHERE status='active'
 (result: 11 rows, phantom read)

Use Cases

Financial reporting: consistent values within report (but new rows okay). Inventory: read quantities consistently (but additions acceptable). Balance: safety for critical, performance acceptable.

SQL Syntax

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;

MySQL Default

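MySQL InnoDB: repeatable read default. Plain SELECT: reads from a transaction-wide snapshot, so phantoms do not appear. Locking reads and writes (SELECT ... FOR UPDATE, UPDATE, DELETE): next-key locks block inserts into the scanned range.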

Serializable

Definition

Complete isolation: result equivalent to some serial order (transactions appear executed one after another). No anomalies: safest. Slowest: maximum locking or, under optimistic schemes, aborts and retries.

Implementation

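Range locks: prevent phantoms. Highest lock level: significant blocking. Alternatively: MVCC with conflict detection (serializable snapshot isolation, used by PostgreSQL): reads do not block, but conflicting transactions are aborted and must be retried.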
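
When serializability is enforced by conflict detection rather than blocking, the client is expected to retry aborted transactions. A minimal sketch of such a retry loop follows. SerializationFailure, run_with_retry, and transfer are hypothetical names used for illustration; a real driver raises its own error type (PostgreSQL reports SQLSTATE 40001), and the conflict here is simulated rather than produced by a database.

```python
class SerializationFailure(Exception):
    """Hypothetical stand-in for a driver's serialization-failure error."""


def run_with_retry(transaction_body, max_attempts=5):
    """Run transaction_body(attempt), retrying when it reports a conflict."""
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction_body(attempt)
        except SerializationFailure:
            if attempt == max_attempts:
                raise  # give up and surface the error to the caller
            # A real client would roll back here and usually back off before retrying.


attempts_seen = []

def transfer(attempt):
    """Simulated transaction: conflicts on the first two attempts, then commits."""
    attempts_seen.append(attempt)
    if attempt < 3:
        raise SerializationFailure()
    return "committed"


outcome = run_with_retry(transfer)
```

The transaction body must be safe to re-run from the start, since every retry repeats all of its reads and writes.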

Anomalies Possible

Dirty reads: no. Non-repeatable reads: no. Phantom reads: no. Lost updates: no. Complete safety.

Example

Transaction A: BEGIN
Transaction A: SELECT * FROM accounts
Transaction A: SELECT COUNT(*) FROM accounts
(lock-based: concurrent inserts blocked until A commits, exact count repeatable)

Performance Cost

Heavy locking: reduced concurrency. Serialization: bottleneck. Use: only when necessary (strict requirements). Rare: most applications accept lower level.

SQL Syntax

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

When to Use

Banking: critical transfers. Medical records: where regulation demands strict consistency. Stock trading: consistency essential. Complex reports: multi-step calculations requiring a single consistent snapshot.

Isolation Levels Comparison

Level              Dirty Read   Non-Rep. Read   Phantom   Performance
Read Uncommitted   Yes          Yes             Yes       Fastest
Read Committed     No           Yes             Yes       Fast
Repeatable Read    No           No              Yes       Moderate
Serializable       No           No              No        Slowest
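
The comparison table can be encoded as data and queried for the weakest level that rules out a given set of anomalies. The sketch below is illustrative only: LEVELS and weakest_level_preventing are hypothetical names, and the table reflects the standard's minimum guarantees, which a particular database may exceed.

```python
# Anomalies each level still permits, ordered weakest to strongest.
LEVELS = [
    ("READ UNCOMMITTED", {"dirty read", "non-repeatable read", "phantom read"}),
    ("READ COMMITTED", {"non-repeatable read", "phantom read"}),
    ("REPEATABLE READ", {"phantom read"}),
    ("SERIALIZABLE", set()),
]

KNOWN_ANOMALIES = {"dirty read", "non-repeatable read", "phantom read"}


def weakest_level_preventing(intolerable):
    """Return the weakest level that permits none of the intolerable anomalies."""
    intolerable = set(intolerable)
    unknown = intolerable - KNOWN_ANOMALIES
    if unknown:
        raise ValueError(f"unknown anomalies: {sorted(unknown)}")
    for name, permitted in LEVELS:
        if not (permitted & intolerable):
            return name
    # Not reached: SERIALIZABLE permits nothing, so the loop always returns.
    raise AssertionError("no level found")


choice_for_reports = weakest_level_preventing({"non-repeatable read"})
choice_for_counts = weakest_level_preventing({"phantom read"})
```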

Practical Selection

Start: database default (usually read committed). Monitor: verify anomalies acceptable. Increase: if problems observed. Trade-off: incremental, risk-based adjustment.

SQL Standard Definitions

Formal Specification

SQL standard: four levels, each defined by the anomalies it must prevent. Databases: follow standard (mostly). Implementation details: vary (MVCC vs. locking). Semantics: same minimum guarantees intended; a level may prevent more than the standard requires.
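
To make the MVCC side of that variation concrete, here is a toy multi-version store in Python. It is a sketch under simplifying assumptions (logical timestamps, committed writes only, no garbage collection), and VersionedStore and Transaction are hypothetical classes, not any database's API. It contrasts a fresh snapshot per statement (read committed) with one snapshot per transaction (repeatable read).

```python
class VersionedStore:
    """Toy multi-version store: each key keeps (commit_ts, value) pairs in commit order."""

    def __init__(self):
        self.versions = {}   # key -> list of (commit_ts, value)
        self.clock = 0       # logical commit timestamp

    def commit_write(self, key, value):
        self.clock += 1
        self.versions.setdefault(key, []).append((self.clock, value))

    def read_as_of(self, key, ts):
        """Return the newest value committed at or before ts, or None."""
        visible = [v for (commit_ts, v) in self.versions.get(key, []) if commit_ts <= ts]
        return visible[-1] if visible else None


class Transaction:
    def __init__(self, store, level):
        self.store = store
        self.level = level             # "read committed" or "repeatable read"
        self.start_ts = store.clock    # snapshot point taken at transaction start

    def read(self, key):
        if self.level == "repeatable read":
            ts = self.start_ts         # one snapshot for the whole transaction
        else:
            ts = self.store.clock      # fresh snapshot for every statement
        return self.store.read_as_of(key, ts)


store = VersionedStore()
store.commit_write("balance", 100)

rc = Transaction(store, "read committed")
rr = Transaction(store, "repeatable read")
first_reads = (rc.read("balance"), rr.read("balance"))

store.commit_write("balance", 150)     # another transaction commits an update

second_reads = (rc.read("balance"), rr.read("balance"))
```

The read committed transaction observes the new value on its second read, while the repeatable read transaction keeps seeing the value from its original snapshot.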

Phantom Read Definition

Second query in range: returns additional rows (not present in first). Rows inserted by other transaction. Defined: on specific range query.

Non-Repeatable Read Definition

Specific row: value changes between reads. Other transaction: modified row. Defined: same key, different value.

Dirty Read Definition

Read value: other transaction modifying (not committed). Later: other transaction rolls back. Read value: never existed in any committed state.

Implementation Variation

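PostgreSQL, MySQL: do not follow the standard strictly. PostgreSQL: read uncommitted behaves as read committed; repeatable read is snapshot isolation (also prevents phantoms); serializable is true serializability via serializable snapshot isolation (since 9.1; earlier versions offered only snapshot isolation under that name). MySQL InnoDB: repeatable read combines snapshots with next-key locks. Verify: specific database behavior.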

Dirty Read Example

Account balance: $100

Transaction B (Writer):
 BEGIN
 UPDATE accounts SET balance=200 WHERE id=1
 (not yet committed)

Transaction A (Reader, read uncommitted):
 SELECT balance FROM accounts WHERE id=1
 (reads: $200, uncommitted value)
 balance calculation: $200 * 0.1 = $20

Transaction B:
 ROLLBACK (change reverted, balance = $100)

Problem: Transaction A used an uncommitted value ($200)
Result: incorrect calculation based on non-existent state

Risk

Data corruption: calculations based on false data. Consistency: violated. Cascading: downstream systems trust false value.
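
The dirty read scenario can be reproduced with a toy row object in Python. This is a sketch, not a database: Row and its methods are hypothetical, and the isolation level is passed as a plain string purely to show which value each level would expose.

```python
class Row:
    """Toy row holding a committed value plus at most one pending (uncommitted) write."""

    def __init__(self, value):
        self.committed = value
        self.pending = None

    def write(self, value):
        self.pending = value          # writer's change, not yet committed

    def commit(self):
        if self.pending is not None:
            self.committed = self.pending
            self.pending = None

    def rollback(self):
        self.pending = None           # discard the uncommitted change

    def read(self, level):
        # Read uncommitted exposes the pending write; stronger levels do not.
        if level == "read uncommitted" and self.pending is not None:
            return self.pending
        return self.committed


balance = Row(100)
balance.write(200)                                  # Transaction B: UPDATE, no commit yet
dirty_value = balance.read("read uncommitted")      # Transaction A sees the pending value
clean_value = balance.read("read committed")        # a stronger level sees the committed value
fee_from_dirty_read = dirty_value // 10             # 10% fee computed from uncommitted data
balance.rollback()                                  # Transaction B: ROLLBACK
final_value = balance.read("read uncommitted")      # the pending value is gone
```

The fee was computed from a balance that no committed state ever contained, which is exactly the risk described above.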

Non-Repeatable Read Example

Employee salary: $50,000

Transaction A (Reader):
 BEGIN
 SELECT salary FROM employees WHERE id=1
 (reads: $50,000)

Transaction B (Writer):
 UPDATE employees SET salary=60000 WHERE id=1
 COMMIT

Transaction A:
 SELECT salary FROM employees WHERE id=1
 (reads: $60,000, different value!)
 COMMIT

Problem: Same query, different result within transaction
Risk: Inconsistent view, recalculation errors

When Problematic

Multi-step calculations: same value needed throughout. Comparisons: value changes (unexpected). Auditing: one transaction records two different values for the same field.

Phantom Read Example

Accounts with status='active': 10

Transaction A (Reporter):
 BEGIN
 SELECT COUNT(*) FROM accounts WHERE status='active'
 (result: 10)

Transaction B (Data Entry):
 INSERT INTO accounts VALUES (..., 'active')
 COMMIT

Transaction A:
 SELECT COUNT(*) FROM accounts WHERE status='active'
 (result: 11, phantom row)
 SELECT * FROM accounts WHERE status='active'
 (returns 11 rows, not 10)
 COMMIT

Problem: Set size changed (rows added)
Risk: Aggregate mismatch (10 vs. 11), pagination issues (row 11 unexpected)

Pagination Impact

Page 1: rows 1-10. Another transaction inserts a row that sorts onto page 1. Page 2 query (offset 10): returns the old row 10 again. User sees: same row on two pages. Delete instead of insert: opposite effect (a row skipped).
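
The pagination effect is easy to see with plain lists. In this Python sketch, page mimics offset-based paging and page_after mimics keyset paging (fetching rows after the last key seen); both function names are hypothetical, and a sorted list of integers stands in for an ordered query result.

```python
def page(rows, page_number, page_size):
    """Offset paging: skip (page_number - 1) * page_size rows, then take page_size."""
    start = (page_number - 1) * page_size
    return rows[start:start + page_size]


def page_after(rows, last_key, page_size):
    """Keyset paging: take the next page_size rows whose key is greater than last_key."""
    return [r for r in rows if r > last_key][:page_size]


rows = [10, 20, 30, 40]
page1 = page(rows, 1, 2)                    # first page as the user saw it

rows = sorted(rows + [5])                   # another transaction inserts a row sorting first

page2_offset = page(rows, 2, 2)             # offset paging repeats a row from page 1
page2_keyset = page_after(rows, page1[-1], 2)   # keyset paging resumes after the last row seen
```

Keyset paging does not make the result set immune to concurrent changes; it only keeps already-seen rows from shifting onto the next page.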

Choosing Isolation Level

Criteria

Data criticality: high = higher isolation. Concurrency requirements: high = lower isolation. Anomaly tolerance: known issues = lower level okay. Performance: balance with requirements.

Decision Process

1. Understand data: critical?
2. Identify anomalies: tolerable?
3. Test workload: benchmark isolation levels.
4. Choose: balance safety and performance.
5. Monitor: adjust if needed.

Default Strategy

Start: database default (read committed usually). Use: unless known issues. Increase: only if anomalies detected. Avoid: premature serializable (performance cost high).

Explicit Setting

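-- MySQL: SET TRANSACTION applies to the next transaction, so it comes first.
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN;
 -- transaction code
COMMIT;
-- PostgreSQL: the level is set inside the transaction instead, e.g.
-- BEGIN ISOLATION LEVEL REPEATABLE READ; ... COMMIT;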

Per-Statement Setting

Some systems: override isolation or locking for a single statement (SQL Server table hints; SELECT ... FOR UPDATE in PostgreSQL and MySQL). Usually: transaction-level (simpler). Verify: database capability.

Monitoring and Tuning

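Log anomalies: track serialization failures, deadlocks, and lock waits to detect whether they occur. Performance: check whether isolation is the bottleneck. Tune: adjust level incrementally. Test: verify behavior under concurrent load.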
