Introduction
Binary search tree: a binary tree with an ordering property (left < parent < right). Enables efficient searching: O(log n) average case. Foundational: the basis for balanced trees (AVL, Red-Black). Disadvantage: no height guarantee (can degenerate into a list).
"Binary search trees elegantly combine tree structure with search capability. Their elegant simplicity hides a critical weakness: without balancing, they collapse into linear lists under adversarial input." -- Fundamental data structures
BST Definition
BST Property
For every node: all values in left subtree < node value < all values in right subtree. Recursive: property applies to all subtrees. Enables: binary search on tree.
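The recursive property can be checked by threading lower/upper bounds down the tree. A minimal sketch in Python (the `Node` class and function names here are illustrative, not from the text):

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def is_bst(node, lo=float("-inf"), hi=float("inf")):
    """Every node must lie strictly between the bounds inherited from its ancestors."""
    if node is None:
        return True
    if not (lo < node.value < hi):
        return False
    # Left subtree inherits the node's value as a new upper bound,
    # right subtree inherits it as a new lower bound.
    return (is_bst(node.left, lo, node.value) and
            is_bst(node.right, node.value, hi))
```

Note that checking only each node against its immediate children is not enough; the bounds are what make the check recursive over whole subtrees.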
Example
        15
       /  \
      10   20
     / \   / \
    5  12 17  25

Valid BST:
- Left of 15: 10, 5, 12 (all < 15)
- Right of 15: 20, 17, 25 (all > 15)
- Recursive: subtrees also maintain the property

Invalid BST

        15
       /  \
      20   10    (WRONG: 20 > 15 but on the left)

Not a BST: violates the ordering property.

BST Properties
In-Order Traversal
Left, root, right: produces sorted sequence. Essential property: reflects data ordering. Used: retrieve data in sorted order.
Height
Balanced: height ~ log n. Skewed: height ~ n (worst case). Average: height ~ log n (random insertion).
Node Count
A tree of height h (counted in levels) holds at most 2^h - 1 nodes, with equality for a full binary tree: n <= 2^h - 1. Reversed: h >= log2(n + 1), so the height is at least logarithmic in n.
Search Operation
Algorithm
search(node, key):
    if node == null: return NOT_FOUND
    if key == node.value: return FOUND
    if key < node.value: return search(node.left, key)
    else: return search(node.right, key)

Complexity
Best: O(1) (found at root). Average: O(log n) (balanced). Worst: O(n) (degenerate list).
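The search pseudocode above translates directly into Python; this is a sketch (the `Node` class is illustrative), returning the node rather than a FOUND flag so the caller can use the hit:

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def search(node, key):
    """Recursive BST search: descend left or right by comparing against the key."""
    if node is None:
        return None                      # NOT_FOUND
    if key == node.value:
        return node                      # FOUND
    if key < node.value:
        return search(node.left, key)
    return search(node.right, key)
```

Each call discards one subtree, so the number of calls is bounded by the tree height.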
Iterative Search
current = root
while current != null:
    if key == current.value: return FOUND
    if key < current.value: current = current.left
    else: current = current.right
return NOT_FOUND

Insertion Operation
Algorithm
insert(node, key):
    if node == null: return new Node(key)
    if key < node.value: node.left = insert(node.left, key)
    else if key > node.value: node.right = insert(node.right, key)
    else: duplicate (no insert)
    return node

Example
Insert 12 into:

      15
     /  \
    10   20

Compare 12 with 15: 12 < 15, go left.
Compare 12 with 10: 12 > 10, go right.
Insert 12 as the right child of 10:

      15
     /  \
    10   20
      \
       12

Complexity
Best: O(1) (insertion into an empty or one-level tree). Average: O(log n). Worst: O(n) (degenerate).
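The insertion pseudocode becomes the following Python sketch (again with an illustrative `Node` class); the recursion returns the subtree root so parent links are rewired on the way back up:

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

def insert(node, key):
    """Walk down as in search; attach a new node where the search falls off the tree."""
    if node is None:
        return Node(key)                     # empty spot: the key lives here
    if key < node.value:
        node.left = insert(node.left, key)
    elif key > node.value:
        node.right = insert(node.right, key)
    # key == node.value: duplicate, ignored
    return node
```

Usage mirrors the worked example: inserting 15, 10, 20, 12 in that order reproduces the tree shown above, with 12 as the right child of 10.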
Deletion Operation
Three Cases
Case 1: Leaf node - delete directly. Case 2: One child - replace with child. Case 3: Two children - find successor/predecessor, replace, delete successor.
Example 1: Leaf
Delete 12 (a leaf) from:

      15
     /  \
    10   20
      \
       12

Result:

      15
     /  \
    10   20

Example 2: One Child

Delete 10 (one child) from:

      15
     /  \
    10   20
      \
       12

Result:

      15
     /  \
    12   20

Example 3: Two Children

Delete 15 (two children) from:

      15
     /  \
    10   20
      \
       12

Find the successor (smallest value > 15): 20.
Replace 15 with 20, then delete the old node 20:

      20
     /
    10
      \
       12

Complexity
Best: O(1) (leaf, no balancing). Average: O(log n). Worst: O(n) (degenerate).
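All three cases can be handled by one recursive function; a sketch in Python (the `Node` class and `min_node` helper are illustrative names, not from the text). Cases 1 and 2 collapse into "return the other child"; case 3 copies the successor's value and then deletes the successor, which by construction has at most one child:

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

def min_node(node):
    """Leftmost node of a subtree: its minimum value."""
    while node.left is not None:
        node = node.left
    return node

def delete(node, key):
    if node is None:
        return None
    if key < node.value:
        node.left = delete(node.left, key)
    elif key > node.value:
        node.right = delete(node.right, key)
    else:
        if node.left is None:                # case 1 (leaf) and case 2 (right child only)
            return node.right
        if node.right is None:               # case 2 (left child only)
            return node.left
        succ = min_node(node.right)          # case 3: two children
        node.value = succ.value              # copy successor's value up
        node.right = delete(node.right, succ.value)  # delete successor (0 or 1 child)
    return node
```

Running this on Example 3 above (delete 15) leaves 20 at the root with 10 and 12 on the left, matching the diagram.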
Successor and Predecessor
Successor
Next larger value. If right child exists: leftmost node in right subtree. Otherwise: ancestor (parent that node is left child of).
Predecessor
Next smaller value. If left child exists: rightmost node in left subtree. Otherwise: ancestor (parent that node is right child of).
Algorithm
successor(node):
    if node.right != null: return leftmost(node.right)
    else: climb parent links until the current node is a left child; return that parent

Use in Deletion
Delete with two children: find successor, copy successor value to node, delete successor (has 0 or 1 child).
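Both successor cases can be sketched in Python, assuming each node carries a `parent` pointer (the `Node` class and helpers are illustrative):

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.left = self.right = self.parent = None

def leftmost(node):
    while node.left is not None:
        node = node.left
    return node

def successor(node):
    # Case 1: a right subtree exists -> its leftmost node is the next value.
    if node.right is not None:
        return leftmost(node.right)
    # Case 2: climb until we leave a left subtree; that ancestor is the next value.
    parent = node.parent
    while parent is not None and node is parent.right:
        node, parent = parent, parent.parent
    return parent                            # None when node holds the maximum
```

In the running example (15 with children 10 and 20, where 10 has right child 12), the successor of 12 is 15: there is no right subtree, so the climb stops at the first ancestor reached from the left.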
Tree Traversals
In-Order (L, Root, R)
Produces a sorted sequence. Example: 5, 10, 12, 15, 17, 20, 25.

Pre-Order (Root, L, R)

Root first: useful for copying a tree. Example: 15, 10, 5, 12, 20, 17, 25.

Post-Order (L, R, Root)

Root last: useful for deletion. Example: 5, 12, 10, 17, 25, 20, 15.

Level-Order (BFS)

Breadth-first: level by level. Example: 15, 10, 20, 5, 12, 17, 25.

Complexity
All traversals: O(n) (visit every node once).
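The four traversals can be sketched compactly in Python (the `Node` class is illustrative; the three depth-first orders differ only in where the root's value is emitted, and level-order uses a queue):

```python
from collections import deque

class Node:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def in_order(node):            # Left, Root, Right -> sorted order
    return in_order(node.left) + [node.value] + in_order(node.right) if node else []

def pre_order(node):           # Root, Left, Right -> copy order
    return [node.value] + pre_order(node.left) + pre_order(node.right) if node else []

def post_order(node):          # Left, Right, Root -> safe deletion order
    return post_order(node.left) + post_order(node.right) + [node.value] if node else []

def level_order(root):         # breadth-first, level by level
    out, queue = [], deque([root] if root else [])
    while queue:
        node = queue.popleft()
        out.append(node.value)
        if node.left:
            queue.append(node.left)
        if node.right:
            queue.append(node.right)
    return out
```

On the example tree from earlier sections, these reproduce the four sequences listed above.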
Time Complexity
Summary Table
| Operation | Best Case | Average Case | Worst Case |
|---|---|---|---|
| Search | O(1) | O(log n) | O(n) |
| Insert | O(1) | O(log n) | O(n) |
| Delete | O(1) | O(log n) | O(n) |
| Space | O(n) | O(n) | O(n) |
Worst-Case Scenarios
Degenerate Tree
Insert: 1, 2, 3, 4, 5 (sorted sequence). Result: a linked-list-like tree:

    1
     \
      2
       \
        3
         \
          4
           \
            5

Height: 5 (= n), not log n.

Avoiding Degeneration
Random insertion: statistically balanced. Balanced variants: guarantee O(log n) height (AVL, Red-Black). Caveat: real input is often sorted or nearly sorted, so randomness cannot be assumed.
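The contrast is easy to demonstrate empirically; this sketch (illustrative `Node`, `insert`, and `height` helpers) inserts 100 keys once in sorted order and once shuffled:

```python
import random

class Node:
    def __init__(self, value):
        self.value, self.left, self.right = value, None, None

def insert(node, key):
    if node is None:
        return Node(key)
    if key < node.value:
        node.left = insert(node.left, key)
    elif key > node.value:
        node.right = insert(node.right, key)
    return node

def height(node):
    return 0 if node is None else 1 + max(height(node.left), height(node.right))

sorted_root = None
for k in range(1, 101):              # sorted input: every key goes right
    sorted_root = insert(sorted_root, k)
# height(sorted_root) == 100: a pure chain

keys = list(range(1, 101))
random.shuffle(keys)                 # random order: tree stays shallow with high probability
random_root = None
for k in keys:
    random_root = insert(random_root, k)
```

The sorted run always yields height n; the shuffled run typically lands near the logarithmic regime, far below n.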
Why Worst Case Matters
Adversary: can deliberately insert sorted input. System critical: cannot accept O(n) worst case. Solution: use self-balancing trees.
Applications
Sorted Data Structures
Maintain sorted order: insertion O(log n) average. Retrieve in order: in-order traversal O(n).
File Systems
Directory structures: efficient file lookup. File allocation tables: tree-based organization.
Expression Trees
Parse expressions: operations as nodes. Evaluation: post-order traversal. Compilation: abstract syntax trees.
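A minimal expression-tree evaluator illustrates the post-order idea; note an expression tree is an ordinary binary tree, not a BST (the `Node` class and `OPS` table are illustrative):

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b,
       "*": lambda a, b: a * b, "/": lambda a, b: a / b}

def evaluate(node):
    """Post-order: evaluate both operand subtrees, then apply the operator at the root."""
    if node.left is None and node.right is None:
        return node.value                    # leaf: a literal operand
    return OPS[node.value](evaluate(node.left), evaluate(node.right))

# (3 + 4) * 2
expr = Node("*", Node("+", Node(3), Node(4)), Node(2))
# evaluate(expr) == 14
```

Compilers apply the same pattern when walking abstract syntax trees: children are processed before the node that combines them.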
Database Indexes
B-trees (multi-way search trees, a generalization of the BST): the practical choice in databases. Billions of rows: the tree structure enables fast lookup.
Self-Balancing Variants
AVL Trees
Height-balanced: height difference <= 1. Guarantee: O(log n) worst case. Rotation: rebalancing operations.
Red-Black Trees
Color-based: red/black node coloring. Guarantee: height <= 2 * log(n). Fewer rotations than AVL.
B-Trees
Multi-way: many children per node. Disk-optimized: minimize I/O. Database standard.
When to Use Basic BST
Theoretical: teaching data structures. Small datasets: n < 100. Random data: no adversarial input. Otherwise: use balanced variant.