Introduction
B-tree: self-balancing multi-way tree (Bayer and McCreight, 1972). Property: every node has multiple keys and children (not binary). Disk-optimized: minimizes disk I/O. Use: database indexes, file systems. Advantage: N-way branching reduces tree height. Cost: node complexity.
"B-trees revolutionized database design by recognizing that disk I/O dominates computation. By storing multiple keys per node, they minimize tree height and disk accesses, enabling efficient indexing of massive datasets." -- Database systems
B-Tree Definition
Formal Definition
B-tree of order m: tree where every node has at most m children. Keys: at most m-1 keys per node. Balanced: all leaves at same depth. Sorted: keys ordered (left < parent < right).
Key Distinction from Binary Trees
Binary: 2 children maximum. B-tree: m children maximum (hundreds possible). Keys per node: 1 key (binary), m-1 keys (B-tree). Branching: exponentially reduces height.
Example: B-Tree Order 3
Order m=3: max 2 keys, max 3 children.

      [15 | 40]
     /    |    \
 [5 10] [20 25] [...]

Internal node: 2 keys (15, 40) and 3 pointers. Children: keys < 15, keys between 15 and 40, keys > 40.
Comparison to Binary Tree
Binary: height for 1 million nodes = ~20 (log2(10^6)). B-tree (order 512): height = ~3 (log512(10^6)). Disk reads: binary ~20, B-tree ~3 (order-of-magnitude difference).
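The height gap can be checked with a quick calculation. This is a sketch; `min_height` is a hypothetical helper that ignores occupancy details and counts only ideal fan-out:

```python
import math

def min_height(n, branching):
    # Levels needed to index n keys when every node fans out `branching` ways.
    return math.ceil(math.log(n + 1, branching))

print(min_height(1_000_000, 2))    # binary tree: 20 levels
print(min_height(1_000_000, 512))  # B-tree of order 512: 3 levels
```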
B-Tree Properties
Core Properties
1. Every node has at most m children. 2. Every internal node (except root) has at least ceil(m/2) children. 3. Root has at least 2 children (if not leaf). 4. All leaves at same depth. 5. Data in leaves or distributed (variant-dependent).
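The five properties can be expressed as an invariant checker. A minimal sketch, assuming a hypothetical `Node` class with sorted `keys` and a `children` list (empty for leaves); it is illustrative, not exhaustive:

```python
import math

class Node:
    # Hypothetical node: sorted `keys`, `children` empty for leaves.
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []

def check(node, m, is_root=True, depth=0, leaf_depths=None):
    # Recursively verify the core order-m invariants listed above.
    if leaf_depths is None:
        leaf_depths = set()
    assert len(node.keys) <= m - 1              # at most m children / m-1 keys
    assert node.keys == sorted(node.keys)       # keys kept in order
    if node.children:
        assert len(node.children) == len(node.keys) + 1
        minimum = 2 if is_root else math.ceil(m / 2)
        assert len(node.children) >= minimum    # minimum occupancy
        for child in node.children:
            check(child, m, False, depth + 1, leaf_depths)
    else:
        leaf_depths.add(depth)
    if is_root:
        assert len(leaf_depths) == 1            # all leaves at same depth
    return True

root = Node([15, 40], [Node([5, 10]), Node([20, 25]), Node([50])])
print(check(root, 3))  # True
```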
Property Implications
Balanced: height always O(log n). Minimum occupancy: ceil(m/2) (prevents sparse trees). Dense packing: efficient space utilization. Order selection: balances node size vs tree depth.
Leaf vs Internal Nodes
Different variants: data in every node (B-trees) or only in leaves (B+ trees). B-tree: data anywhere. B+ tree: data in leaves, internal nodes for indexing only.
Tree Structure and Nodes
Internal Node Structure
Node with k keys:
[ptr0, key1, ptr1, key2, ptr2, ..., keyk, ptrk]
Example (k=2):
[ptr0, 15, ptr1, 40, ptr2]
Pointers: ptr0 (keys < 15), ptr1 (keys between 15 and 40), ptr2 (keys > 40). Keys: ordered (key_i < key_{i+1}).
Node Splitting Threshold
Full node: m children, m-1 keys. Inserting into a full node triggers a split. Minimum occupancy: at least ceil(m/2) children per node (root exempt). Underflow (below minimum): merge or rebalance.
Depth Analysis
Minimum keys per non-root node: ceil(m/2) - 1, so fan-out below the root is at least t = ceil(m/2). A tree of height h therefore holds at least 2t^h - 1 keys, and for n keys: height <= log_t((n+1)/2). Low height: few disk accesses.
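The worst-case bound can be evaluated directly. A sketch assuming the standard minimum-degree analysis (every non-root node at least half full):

```python
import math

def max_height(n, m):
    # Worst-case height (in edges) for n keys with minimum degree t = ceil(m/2):
    # every non-root node is at least half full, so h <= log_t((n + 1) / 2).
    t = math.ceil(m / 2)
    return math.log((n + 1) / 2, t)

print(max_height(10**9, 512))  # about 3.6: at most 3 levels below the root
```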
Example Levels
Order m=512 (node sized to a typical disk block):
Min children per node: 256. Max children: 512. Max keys per node: 511.
Level 0 (root): up to 511 keys
Level 1: up to 512 * 511 ≈ 262K keys
Level 2: up to 512^2 * 511 ≈ 134M keys
Level 3: up to 512^3 * 511 ≈ 68B keys
Typical database: needs only 3-4 levels (disk accesses).
Search Operations
Search Algorithm
search(node, key):
  find position i where keys[i-1] < key <= keys[i]
  if key == keys[i]: found
  if node is leaf: key not found
  else: search(children[i], key)  -- recursively search child
Complexity
Comparisons per node: O(log m) (binary search on keys). Nodes visited: O(log n) (tree height). Total: O(log m * log n). In practice: dominated by disk I/O (few disk accesses).
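The pseudocode above can be sketched in Python. `Node` is a hypothetical in-memory layout (sorted `keys`, `children` empty for leaves); the binary search within a node uses `bisect`:

```python
import bisect

class Node:
    # Hypothetical node layout: sorted `keys`, `children` empty for leaves.
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []

def search(node, key):
    # Binary search inside the node, then descend into the matching child.
    i = bisect.bisect_left(node.keys, key)
    if i < len(node.keys) and node.keys[i] == key:
        return True            # found in this node
    if not node.children:
        return False           # leaf reached: key absent
    return search(node.children[i], key)

root = Node([15, 40], [Node([5, 10]), Node([20, 25]), Node([50, 60])])
print(search(root, 25), search(root, 7))  # True False
```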
Range Search
Find all keys in range [a, b]. Start: find the first key >= a. Iterate: continue the in-order traversal (or follow leaf sibling pointers in a B+ tree). Cost: O(log n + k) where k = number of results (optimal).
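For a plain B-tree (no leaf sibling pointers), the range scan is a pruned in-order traversal. A sketch with a hypothetical `Node` class; subtrees whose key interval cannot intersect [a, b] are skipped:

```python
class Node:
    # Hypothetical node layout: sorted `keys`, `children` empty for leaves.
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []

def range_query(node, a, b, out=None):
    # Visit child i only if the separators around it allow keys in [a, b].
    if out is None:
        out = []
    n = len(node.keys)
    for i in range(n + 1):
        lo = node.keys[i - 1] if i > 0 else None
        hi = node.keys[i] if i < n else None
        if node.children and (lo is None or lo <= b) and (hi is None or hi >= a):
            range_query(node.children[i], a, b, out)
        if i < n and a <= node.keys[i] <= b:
            out.append(node.keys[i])
    return out

root = Node([15, 40], [Node([5, 10]), Node([20, 25]), Node([50, 60])])
print(range_query(root, 10, 25))  # [10, 15, 20, 25]
```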
Key Lookup Optimization
Cache nodes: frequently accessed nodes in memory. Prefetch: load sibling nodes. Index caching: root typically cached (frequent access).
Insertion Algorithm
Basic Algorithm
1. Search for leaf where key belongs
2. Insert key in correct position (if not duplicate)
3. If leaf now exceeds m-1 keys (overfull):
   - Split leaf into two
   - Move middle key up to parent
4. If parent becomes overfull: recursively split parent
5. If root splits: create new root
Example Insertion
Insert 17 (order 4, max 3 keys per node) into:
    [10 | 20]
   /    |    \
[5 8] [12 15] [25 30]
Find leaf for 17: goes to [12 15]. Insert 17: [12 15 17].
After insertion:
    [10 | 20]
   /    |    \
[5 8] [12 15 17] [25 30]
Split Operation
Insert 18: leaf becomes [12 15 17 18], which is overfull (4 keys, max 3): SPLIT.
Before: [12 15 17 18]
Middle: 15
After: [12] and [17 18]
Move 15 up to parent.
Parent [10 20] becomes [10 15 20] with 4 children.
Cascading Splits
If parent becomes full: split parent. If grandparent becomes full: split grandparent. Cascades up to root. Creates new root if needed (tree grows by 1 level).
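The five insertion steps can be sketched in Python (hypothetical `Node` class, order m = 4 to match the examples, bottom-up splitting; a sketch, not a production implementation):

```python
import bisect

MAX_KEYS = 3  # order m = 4: at most m - 1 = 3 keys per node (assumption)

class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []  # empty for leaves

def insert(root, key):
    # Insert key; split overfull nodes bottom-up. Returns the (possibly new) root.
    path = []
    node = root
    while node.children:                        # step 1: descend to the leaf
        i = bisect.bisect_left(node.keys, key)
        path.append((node, i))
        node = node.children[i]
    bisect.insort(node.keys, key)               # step 2: insert into the leaf
    while len(node.keys) > MAX_KEYS:            # steps 3-4: split, push median up
        mid = len(node.keys) // 2
        median = node.keys[mid]
        right = Node(node.keys[mid + 1:], node.children[mid + 1:])
        node.keys, node.children = node.keys[:mid], node.children[:mid + 1]
        if path:
            parent, i = path.pop()
            parent.keys.insert(i, median)
            parent.children.insert(i + 1, right)
            node = parent
        else:                                   # step 5: root split grows the tree
            return Node([median], [node, right])
    return root

root = Node()
for k in [10, 20, 5, 15, 17, 25]:
    root = insert(root, k)
print(root.keys, [c.keys for c in root.children])  # [15] [[5, 10], [17, 20, 25]]
```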
Deletion Algorithm
Deletion Cases
Case 1: Key in leaf - delete directly (if no underflow). Case 2: Key in internal node - replace with predecessor/successor, delete from leaf. Case 3: After deletion, node underflows - rebalance (borrow or merge).
Underflow Handling
Node has fewer than ceil(m/2)-1 keys: underflow.
Solution 1: Borrow from sibling - if a sibling has a spare key, rotate it through the parent.
Solution 2: Merge with sibling - combine the two nodes, pulling the separator key down from the parent.
Example Deletion
Delete 15 from leaf [10 15 20] (order 4, max 3 keys).
After deletion: [10 20] (2 keys).
Check: min keys = ceil(4/2) - 1 = 1.
Status: [10 20] still valid (at least 1 key).
Cascading Merges
If parent underflows after pulling a key down: rebalance or merge the parent. Cascades up to root. Root may have as few as 1 key (special case). If the root becomes empty: its single child becomes the new root, and the tree shrinks by 1 level.
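The borrow-or-merge decision can be sketched in Python. This is a leaf-level sketch with hypothetical names (`Node`, `fix_underflow`); moving child pointers for internal nodes is omitted:

```python
import math

class Node:
    # Hypothetical node: sorted `keys`; the parent holds the child list.
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []

def fix_underflow(parent, i, m):
    # Repair parent.children[i] after it drops below ceil(m/2) - 1 keys:
    # borrow through the parent if a sibling has a spare key, else merge.
    # (Leaf-only sketch: child pointers of internal nodes are not moved.)
    min_keys = math.ceil(m / 2) - 1
    node = parent.children[i]
    left = parent.children[i - 1] if i > 0 else None
    right = parent.children[i + 1] if i + 1 < len(parent.children) else None
    if left and len(left.keys) > min_keys:      # rotate from left sibling
        node.keys.insert(0, parent.keys[i - 1])
        parent.keys[i - 1] = left.keys.pop()
    elif right and len(right.keys) > min_keys:  # rotate from right sibling
        node.keys.append(parent.keys[i])
        parent.keys[i] = right.keys.pop(0)
    elif left:                                  # merge into left sibling
        left.keys += [parent.keys.pop(i - 1)] + node.keys
        parent.children.pop(i)
    else:                                       # merge with right sibling
        node.keys += [parent.keys.pop(i)] + right.keys
        parent.children.pop(i + 1)

parent = Node([20], [Node([]), Node([30, 40])])
fix_underflow(parent, 0, 4)
print(parent.keys, [c.keys for c in parent.children])  # [30] [[20], [40]]
```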
Complexity
Find: O(log n). Remove key from node: O(m) work within the node. Rebalancing: at most O(log n) borrows/merges along the path. Total: O(log n) node accesses.
Node Splitting
Split Procedure
Split an overfull node:
1. Find the middle key (lower or upper median)
2. Create a new node with the upper half of the keys (and children)
3. Move the middle key up to the parent
4. Parent insertion may trigger another split
Example: Order 5
Full node (5 children, 4 keys):
[10 | 20 | 30 | 40]
with 5 child pointers.
Split (middle = 30):
Left: [10 20]  Right: [40]
Move 30 to parent.
Parent gains a new child:
   ... 30 ...
      /  \
 [10 20] [40]
Balancing Property
After split: both nodes have roughly m/2 children. Maintains balance property. Min/max constraints respected.
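The key partitioning itself reduces to picking a median, as in this sketch (upper median, matching the order-5 example; `split` is a hypothetical helper for the key list only):

```python
def split(keys):
    # Partition a full key list around its (upper) median.
    mid = len(keys) // 2
    return keys[:mid], keys[mid], keys[mid + 1:]

print(split([10, 20, 30, 40]))  # ([10, 20], 30, [40]): 30 moves up to the parent
```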
Node Merging
Merge Procedure
Handle an underfull node:
1. Check if a sibling has a spare key (can borrow)
   If yes: move a key through the parent (rotation)
   If no: merge with the sibling:
2. Pull the separator key down from the parent
3. Combine keys and children from both nodes
4. Delete the emptied node
Example Rotation
Node [10] (underflow) with right sibling [30 40]; parent separator: [20].
Borrow from right sibling:
Left: [10 20] (20 moved down from parent)
Right: [40] (30 moved up to parent)
Parent: [30] (updated separator)
Example Merge
Node [10] and sibling [30] (both minimal); parent separator: [20].
Merge:
Combined: [10 20 30] (20 pulled down from parent)
Parent loses 20: may become empty/underflow.
Cascading Effect
Parent underflow: triggers a borrow or merge at the parent level. Cascades up to the root. Root special case: if the root is left with no keys, its single remaining child becomes the new root.
Order Parameter (m)
Order Selection
Order m determines: node capacity, tree shape, performance. Typical: m = (disk block size / entry size). Example: block 4KB, entry 8 bytes -> m = 512.
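The order estimate can be made slightly more precise by accounting for both keys and child pointers. A rough sketch; the 4 KB block and 4-byte key/pointer sizes are assumptions for illustration:

```python
def choose_order(block_size, key_size, ptr_size):
    # Rough estimate: fit m child pointers and m - 1 keys into one block,
    # i.e. m * ptr_size + (m - 1) * key_size <= block_size.
    return (block_size + key_size) // (key_size + ptr_size)

print(choose_order(4096, 4, 4))  # 512: one node per 4 KB block
```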
Effects of Order
| Aspect | Small Order (m=3) | Large Order (m=512) |
|---|---|---|
| Tree height | Large (many levels) | Small (few levels) |
| Node size | Small (2 keys) | Large (511 keys) |
| Comparisons/node | Few (linear) | More (binary search) |
| Disk I/O | High (many accesses) | Low (few accesses) |
| Best for | Memory-based (RAM) | Disk-based (databases) |
Optimal Order
Minimize cost = tree height + node search cost. For disk: high order (minimize height, I/O dominates). For memory: lower order (minimize comparisons, CPU matters).
Time and Space Complexity
Time Complexity
| Operation | Complexity | Notes |
|---|---|---|
| Search | O(log n) | Tree height |
| Insert | O(log n) | Find + split |
| Delete | O(log n) | Find + merge |
| Range query | O(log n + k) | k = results |
Space Complexity
Storage: O(n) for n keys. Occupancy: 50% minimum (ceil(m/2) children). Effective storage: between 50%-100% full.
Practical Disk Analysis
Disk access dominates. For 1 billion keys with order 512: height ~4 levels = 4 disk reads. Memory reads: negligible compared to disk. B-trees designed specifically to minimize disk I/O.
Real-World Applications
Database Indexes
Primary use: relational databases (MySQL, PostgreSQL). Index: quickly locate rows. B+ tree variant: maintains sorted order, efficient range scans. Billions of records: practical with 3-4 tree levels.
File Systems
NTFS, ext4, HFS+: use B-trees for metadata. Fast directory lookup: few disk accesses. Efficient tree operations: insertion/deletion balanced.
Key-Value Stores
LSM trees (Bigtable, LevelDB): a write-optimized alternative to B-trees. Sorted runs (SSTables): efficient range queries. Compaction: periodically merges runs into larger sorted files.
Search Engines
Inverted indexes: map search terms to documents. Term dictionaries: often stored in B-tree variants for sorted lookup and prefix queries. Index size: petabytes managed with B-tree structures.
NoSQL Databases
MongoDB, CouchDB: B-tree indexes for document lookup. Multi-key indexes: compound keys (multiple fields). Efficient: optimized for disk-based storage.
References
- Bayer, R., and McCreight, E. "Organization and Maintenance of Large Ordered Indexes." Acta Informatica, vol. 1, 1972, pp. 173-189.
- Knuth, D. E. "The Art of Computer Programming: Sorting and Searching." Vol. 3, Addison-Wesley, 2nd edition, 1998.
- Cormen, T. H., Leiserson, C. E., Rivest, R. L., and Stein, C. "Introduction to Algorithms." MIT Press, 3rd edition, 2009.
- Comer, D. "The Ubiquitous B-Tree." ACM Computing Surveys, vol. 11, no. 2, 1979, pp. 121-137.
- Ramakrishnan, R., and Gehrke, J. "Database Management Systems." McGraw-Hill, 3rd edition, 2003.