Introduction
Isolation level: how much of other concurrent transactions' work (in-progress or committed) a transaction can observe. Spectrum: weak (dirty reads possible) to strong (complete isolation). Trade-off: consistency vs. concurrency. Default varies (PostgreSQL: read committed; MySQL InnoDB: repeatable read).
Standard: SQL defines four levels. Implementation varies (MVCC, locking). Choice: application requirements. Higher level: safer but slower. Lower level: faster but risky.
Common mistake: assuming ACID means full isolation by default (most defaults are weaker than serializable). Verify: database default. Critical data: explicitly set level. Non-critical: relax for performance.
"Isolation levels balance correctness and performance. Strong guarantees slow systems, weak enable speed but risk anomalies. Understanding: essential for correct database design." -- Concurrency control
Transaction Anomalies
Dirty Read
Read uncommitted data: later rolled back. Transaction sees a value that never existed in committed state. Consistency violation: application assumes value persists (false). Rarely acceptable: possible only at the lowest level (read uncommitted).
Non-Repeatable Read
Same read twice: different results (row updated and committed by other transaction). Inconsistent view: value changed during transaction. Every value read was committed, so no false data; risk: multi-step logic mixing old and new values. Acceptable: many applications.
Phantom Read
Range query twice: different result set (rows inserted/deleted by other). Ghost rows: appear/disappear. Query intent violated: range changed. Problematic: aggregate queries, pagination.
Lost Update
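Two transactions read the same value, each writes back a result based on it: later write overwrites earlier. Both transactions: believe success. But: one update lost. Correctness violation: data silently wrong. Not in the SQL standard's anomaly list; read committed does not prevent it. Prevention: atomic UPDATE (SET x = x + n), SELECT ... FOR UPDATE, or an engine/level that detects the conflict. Distinct from dirty write (overwriting another transaction's uncommitted change), which engines generally block at every level.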
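A minimal Python sketch of the interleaving behind a lost update. A plain dict stands in for the table, and the names `db`, `safe_db` and `atomic_deposit` are illustrative only, not part of any database API. Both transactions read the starting balance before either writes, so the second write discards the first; an atomic increment avoids the problem.

```python
# Toy model: a dict stands in for the accounts table (illustrative only).
db = {"balance": 100}

# Both transactions read before either one writes.
read_a = db["balance"]          # Transaction A reads 100
read_b = db["balance"]          # Transaction B reads 100

db["balance"] = read_a + 10     # A deposits 10 and writes 110
db["balance"] = read_b + 20     # B deposits 20 and writes 120, overwriting A

lost_update_result = db["balance"]   # 120, although 130 was intended


def atomic_deposit(store, amount):
    """Read and write in one step, like UPDATE ... SET balance = balance + n."""
    store["balance"] += amount


safe_db = {"balance": 100}
atomic_deposit(safe_db, 10)
atomic_deposit(safe_db, 20)
safe_result = safe_db["balance"]     # 130
```

The atomic form works because the read and the write are never separated by another transaction's write.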
Anomaly Hierarchy
Severity, roughly: dirty read > non-repeatable read > phantom read. Serializable: permits none (safest). Weaker levels: permit some (faster). Application decides: tolerance level.
Read Uncommitted
Definition
Minimal isolation: transactions see uncommitted writes. Read any data: committed or not. Fastest: no read locks needed. Unsafe: anomalies common.
Anomalies Possible
Dirty reads: yes. Non-repeatable reads: yes. Phantom reads: yes. Lost updates: yes for read-modify-write sequences (only dirty writes are generally blocked). Only use: non-critical data where speed is essential.
Implementation
No read locks: writes visible immediately. Readers: don't block writers. Writers: don't block readers (but still lock rows against other writers). Fastest possible: trades safety.
Use Cases
Real-time statistics: approximate acceptable. Cache: stale data tolerated. Reporting on production: data not critical. Rare: risky for most applications.
SQL Syntax
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
-- PostgreSQL accepts this syntax but treats it as READ COMMITTED
Read Committed
Definition
Read only committed data. No dirty reads. Non-repeatable reads possible (value updated). Phantom reads possible (rows added/deleted). Good balance: safety and performance.
Implementation
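Lock-based engines: shared locks held only during each read (released after); exclusive locks held until commit. Prevents: dirty reads (exclusive lock blocks readers). Allows: non-repeatable reads (shared lock released early). MVCC engines (e.g., PostgreSQL): no read locks; each statement reads the latest committed snapshot.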
Anomalies Possible
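Dirty reads: no. Non-repeatable reads: yes (likely). Phantom reads: yes. Lost updates: yes for read-modify-write sequences unless the row is locked (SELECT ... FOR UPDATE) or updated atomically.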
Example Scenario
Transaction A: BEGIN
Transaction A: SELECT balance FROM accounts WHERE id=1
(gets committed value: $100)
Transaction B: BEGIN
Transaction B: UPDATE accounts SET balance=150 WHERE id=1
Transaction B: COMMIT
Transaction A: SELECT balance FROM accounts WHERE id=1
(gets new value: $150, non-repeatable read)
Popularity
PostgreSQL default. Most OLTP systems: default. Good for: high concurrency, mostly safe. Trade-off: acceptable anomalies for performance.
SQL Syntax
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
Repeatable Read
Definition
Same row: consistent value throughout transaction. No non-repeatable reads. Phantom reads possible (new rows appear). Intermediate: more safety than read committed.
Implementation
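Lock-based engines: shared locks held until commit (not released early). Ensures: same row read twice = same value. Range locks not taken: phantoms possible. MVCC engines: all reads in the transaction use one snapshot taken early in the transaction.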
Anomalies Possible
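Dirty reads: no. Non-repeatable reads: no. Phantom reads: yes (per the standard; some engines also prevent them). Lost updates: engine-dependent (PostgreSQL aborts the conflicting transaction; MySQL InnoDB with plain reads does not).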
Example Scenario
Transaction A: BEGIN
Transaction A: SELECT COUNT(*) FROM accounts WHERE status='active'
(result: 10 rows)
Transaction B: INSERT INTO accounts VALUES (..., 'active')
Transaction B: COMMIT
Transaction A: SELECT COUNT(*) FROM accounts WHERE status='active'
(result: 11 rows, phantom read)
Use Cases
Financial reporting: consistent values within report (but new rows okay). Inventory: read quantities consistently (but additions acceptable). Balance: stronger guarantees for critical reads at acceptable performance cost.
SQL Syntax
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
MySQL Default
MySQL InnoDB: repeatable read default. Plain SELECTs: read from a consistent snapshot. Locking reads, UPDATE, DELETE: next-key locks prevent phantoms in the locked range.
Serializable
Definition
Complete isolation: transactions serialized (appear executed one after another). No anomalies: safest. Slowest: maximum locking.
Implementation
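Lock-based: range locks prevent phantoms. Highest lock level: significant blocking. Alternatively: MVCC snapshots with conflict detection (e.g., PostgreSQL's serializable snapshot isolation), which aborts a conflicting transaction instead of blocking it.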
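When conflicts are detected rather than blocked, the engine aborts one of the conflicting transactions and the application is expected to run it again. A minimal Python sketch of such a retry loop follows. `SerializationFailure` and `run_with_retry` are hypothetical names standing in for whatever error the database driver actually raises (PostgreSQL, for instance, reports SQLSTATE 40001), and `max_attempts` is assumed to be at least 1.

```python
class SerializationFailure(Exception):
    """Stand-in for a driver's serialization-failure error (hypothetical name)."""


def run_with_retry(transaction, max_attempts=3):
    """Run transaction() and retry it when it aborts with a serialization failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except SerializationFailure:
            if attempt == max_attempts:
                raise  # give up and surface the failure to the caller


# Usage with a fake transaction that conflicts twice, then succeeds.
attempts = []


def flaky_transfer():
    attempts.append(1)
    if len(attempts) < 3:
        raise SerializationFailure("conflict detected")
    return "committed"


outcome = run_with_retry(flaky_transfer)
```

The whole transaction is re-run from the start, because values read before the abort may no longer be valid.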
Anomalies Possible
Dirty reads: no. Non-repeatable reads: no. Phantom reads: no. Lost updates: no. Complete safety.
Example
Transaction A: BEGIN
Transaction A: SELECT * FROM accounts
Transaction A: SELECT COUNT(*) FROM accounts
(concurrent inserts are blocked or aborted on conflict; the count is repeatable)
Performance Cost
Heavy locking: reduced concurrency. Serialization: bottleneck. Use: only when necessary (strict requirements). Rare: most applications accept lower level.
SQL Syntax
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
When to Use
Banking: critical transfers. Medical records: where strict consistency is mandated. Stock trading: consistency essential. Complex reports: multi-step calculations requiring one consistent snapshot.
Isolation Levels Comparison
| Level | Dirty Read | Non-Rep. Read | Phantom | Performance |
|---|---|---|---|---|
| Read Uncommitted | Yes | Yes | Yes | Fastest |
| Read Committed | No | Yes | Yes | Fast |
| Repeatable Read | No | No | Yes | Moderate |
| Serializable | No | No | No | Slowest |
Practical Selection
Start: database default (usually read committed). Monitor: verify anomalies acceptable. Increase: if problems observed. Trade-off: incremental, risk-based adjustment.
SQL Standard Definitions
Formal Specification
SQL standard: four levels with specific anomaly guarantees. Databases: follow standard (mostly). Implementation details: vary (MVCC vs. locking). Semantics: the standard specifies which anomalies each level must prevent, not how; engines may be stricter than required.
Phantom Read Definition
Second query in range: returns additional rows (not present in first). Rows inserted by other transaction. Defined: on specific range query.
Non-Repeatable Read Definition
Specific row: value changes between reads. Other transaction: modified row. Defined: same key, different value.
Dirty Read Definition
Read value: written by another transaction that has not committed. Later: other transaction rolls back. Read value: never committed (did not exist in any committed state).
Implementation Variation
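PostgreSQL, MySQL: neither maps exactly onto the standard. PostgreSQL: read uncommitted behaves as read committed; repeatable read is snapshot isolation (no phantoms); serializable uses serializable snapshot isolation (since 9.1), which is truly serializable. MySQL InnoDB: repeatable read mixes snapshot reads with next-key locking. Verify: specific database behavior.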
Dirty Read Example
Account balance: $100
Transaction B (Writer):
UPDATE accounts SET balance=200 WHERE id=1
(not yet committed)
Transaction A (Reader):
SELECT balance FROM accounts WHERE id=1
(reads: $200, uncommitted value)
balance calculation: $200 * 0.1 = $20
Transaction B:
ROLLBACK (change reverted, balance = $100)
Problem: Transaction A used an uncommitted value ($200) that was never committed
Result: incorrect calculation based on non-existent state
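The scenario can be reproduced with a toy model in Python. This is a sketch under simplifying assumptions: `ToyStore` is a hypothetical single-row store that keeps one pending (uncommitted) value beside the committed one, and the `allow_dirty` flag stands in for the reader's isolation level.

```python
class ToyStore:
    """Toy single-row store that keeps committed and uncommitted values apart."""

    def __init__(self, value):
        self.committed = value
        self.pending = None      # uncommitted write, if any

    def write(self, value):
        self.pending = value     # the writer has not committed yet

    def rollback(self):
        self.pending = None

    def read(self, allow_dirty):
        # READ UNCOMMITTED-style reads may return the pending value.
        if allow_dirty and self.pending is not None:
            return self.pending
        return self.committed


store = ToyStore(100)
store.write(200)                          # Transaction B: UPDATE, no COMMIT
dirty = store.read(allow_dirty=True)      # Transaction A under READ UNCOMMITTED
clean = store.read(allow_dirty=False)     # Transaction A under READ COMMITTED
fee_from_dirty = dirty * 10 // 100        # 10% of a value that never commits
store.rollback()                          # Transaction B: ROLLBACK
final = store.read(allow_dirty=True)      # pending value is gone
```

Only the dirty reader ever sees 200; the committed balance stays at 100 throughout.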
Risk
Data corruption: calculations based on false data. Consistency: violated. Cascading: downstream systems trust false value.
Non-Repeatable Read Example
Employee salary: $50,000
Transaction A (Reader):
BEGIN
SELECT salary FROM employees WHERE id=1
(reads: $50,000)
Transaction B (Writer):
UPDATE employees SET salary=60000 WHERE id=1
COMMIT
Transaction A:
SELECT salary FROM employees WHERE id=1
(reads: $60,000, different value!)
COMMIT
Problem: Same query, different result within transaction
Risk: Inconsistent view, recalculation errors
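A rough Python sketch of why the second read differs, modeling a snapshot-based (MVCC) engine rather than a lock-based one. `VersionedRow` and its logical clock are hypothetical simplifications: a per-statement view gives read committed behavior, while reusing one transaction-wide snapshot gives repeatable read behavior.

```python
class VersionedRow:
    """Toy multi-version row: keeps every committed value with its commit time."""

    def __init__(self, value):
        self.versions = [(0, value)]   # (commit_time, value), oldest first
        self.clock = 0

    def commit_update(self, value):
        self.clock += 1
        self.versions.append((self.clock, value))

    def read_as_of(self, time):
        # Newest version committed at or before the given time.
        visible = [v for t, v in self.versions if t <= time]
        return visible[-1]


salary = VersionedRow(50_000)

# Transaction A starts; a snapshot-based REPEATABLE READ fixes its view here.
snapshot_time = salary.clock
first_read = salary.read_as_of(salary.clock)

salary.commit_update(60_000)           # Transaction B updates and commits

# READ COMMITTED-style: each statement sees the latest committed version.
read_committed_second = salary.read_as_of(salary.clock)
# REPEATABLE READ-style: every statement reuses the transaction's snapshot.
repeatable_second = salary.read_as_of(snapshot_time)
```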
When Problematic
Multi-step calculations: same value needed throughout. Comparisons: value changes (unexpected). Auditing: trail records two different values for one logical read.
Phantom Read Example
Accounts with status='active': 10
Transaction A (Reporter):
BEGIN
SELECT COUNT(*) FROM accounts WHERE status='active'
(result: 10)
Transaction B (Data Entry):
INSERT INTO accounts VALUES (..., 'active')
COMMIT
Transaction A:
SELECT COUNT(*) FROM accounts WHERE status='active'
(result: 11, phantom row)
SELECT * FROM accounts WHERE status='active'
(returns 11 rows, not 10)
COMMIT
Problem: Set size changed (rows added)
Risk: Aggregate mismatch (10 vs. 11), pagination issues (row 11 unexpected)
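The same effect for a range query, as a small Python sketch. The `rows` list and the `committed_at` field are illustrative: each row carries the logical time at which its insert committed, and a query counts only rows visible at the time it is evaluated against.

```python
# Toy table: each row records the logical time at which its insert committed.
rows = [{"id": i, "status": "active", "committed_at": 0} for i in range(1, 11)]


def count_active(table, as_of):
    """Count active rows whose insert committed at or before as_of."""
    return sum(
        1 for r in table
        if r["status"] == "active" and r["committed_at"] <= as_of
    )


snapshot_time = 0                                   # Transaction A begins
first_count = count_active(rows, as_of=0)

# Transaction B inserts an active row and commits at time 1.
rows.append({"id": 11, "status": "active", "committed_at": 1})

latest_count = count_active(rows, as_of=1)                 # per-statement view
snapshot_count = count_active(rows, as_of=snapshot_time)   # transaction snapshot
```

The per-statement view picks up the phantom row; the transaction-wide snapshot does not.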
Pagination Impact
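Page 1: rows 1-10. Later query over the same range: 11 rows (one phantom). Offset pagination with each page a separate query: an insert ahead of the offset shifts later rows by one. User sees: inconsistency (last row of one page repeated as first row of the next).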
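A short Python sketch of the duplicated-row effect, assuming newest-first ordering and offset-based paging where each page is fetched by a separate query. `fetch_page` is a hypothetical helper that mimics ORDER BY id DESC with LIMIT and OFFSET over a plain list of ids.

```python
def fetch_page(ids, page, page_size=10):
    """OFFSET/LIMIT-style paging over ids sorted newest first."""
    ordered = sorted(ids, reverse=True)
    start = (page - 1) * page_size
    return ordered[start:start + page_size]


ids = list(range(1, 21))            # 20 existing rows
page_1 = fetch_page(ids, page=1)    # ids 20 down to 11

ids.append(21)                      # another transaction inserts and commits

page_2 = fetch_page(ids, page=2)    # ids 11 down to 2
repeated = sorted(set(page_1) & set(page_2))   # rows shown on both pages
```

Keyset pagination (filtering on the last id seen instead of an offset) is a common way to avoid the shift without raising the isolation level.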
Choosing Isolation Level
Criteria
Data criticality: high = higher isolation. Concurrency requirements: high = lower isolation. Anomaly tolerance: anomalies known to be harmless = lower level okay. Performance: balance with requirements.
Decision Process
1. Understand data: critical? 2. Identify anomalies: tolerable? 3. Test workload: benchmark isolation levels. 4. Choose: balance safety and performance. 5. Monitor: adjust if needed.
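Step 2 of the process can be sketched as a lookup in Python. `PERMITS`, `ORDER` and `weakest_level` are illustrative names. The table encodes only the standard's minimum guarantees, so a real engine may be stricter, and anomalies outside the standard's three (such as lost updates) are not modeled.

```python
# Anomalies each level still permits under the SQL standard's definitions.
PERMITS = {
    "READ UNCOMMITTED": {"dirty read", "non-repeatable read", "phantom read"},
    "READ COMMITTED": {"non-repeatable read", "phantom read"},
    "REPEATABLE READ": {"phantom read"},
    "SERIALIZABLE": set(),
}

# Weakest first, so the first match is the least restrictive acceptable level.
ORDER = ["READ UNCOMMITTED", "READ COMMITTED", "REPEATABLE READ", "SERIALIZABLE"]


def weakest_level(intolerable):
    """Return the weakest standard level that permits none of the given anomalies."""
    wanted = set(intolerable)
    unknown = wanted - PERMITS["READ UNCOMMITTED"]
    if unknown:
        raise ValueError(f"unknown anomalies: {sorted(unknown)}")
    for level in ORDER:
        if not (PERMITS[level] & wanted):
            return level


choice = weakest_level(["non-repeatable read"])   # "REPEATABLE READ"
```

The result is a starting point for the benchmark in step 3, not a substitute for it.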
Default Strategy
Start: database default (read committed usually). Use: unless known issues. Increase: only if anomalies detected. Avoid: premature serializable (performance cost high).
Explicit Setting
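-- Order shown suits MySQL, where SET TRANSACTION applies to the next transaction.
-- PostgreSQL: run SET TRANSACTION after BEGIN, or use BEGIN ISOLATION LEVEL REPEATABLE READ.
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN;
-- transaction code
COMMIT;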
Per-Statement Setting
Some systems: set isolation for individual statements or tables (e.g., SQL Server table hints such as WITH (NOLOCK)); advanced. Usually: transaction-level (simpler). Verify: database capability.
Monitoring and Tuning
Log anomalies: detect if occurring (e.g., serialization failures, deadlocks, lock-wait timeouts). Performance: check if isolation is the bottleneck. Tune: adjust level incrementally. Test: verify each change under realistic load.