Introduction

Binary search tree: binary tree with ordering property (left < parent < right). Enables efficient searching O(log n) average case. Foundational: basis for balanced trees (AVL, Red-Black). Disadvantage: no guarantees (can degenerate to list).

"Binary search trees elegantly combine tree structure with search capability. Their elegant simplicity hides a critical weakness: without balancing, they collapse into linear lists under adversarial input." -- Fundamental data structures

BST Definition

BST Property

For every node: all values in left subtree < node value < all values in right subtree. Recursive: property applies to all subtrees. Enables: binary search on tree.

Example

 15 / \ 10 20 / \ / \ 5 12 17 25Valid BST:- Left of 15: 10, 5, 12 (all < 15)- Right of 15: 20, 17, 25 (all > 15)- Recursive: subtrees also maintain property

Invalid BST

 15 / \ 20 10 (WRONG: 20 > 15 but on left)Not a BST: violates ordering property

BST Properties

In-Order Traversal

Left, root, right: produces sorted sequence. Essential property: reflects data ordering. Used: retrieve data in sorted order.

Height

Balanced: height ~ log n. Skewed: height ~ n (worst case). Average: height ~ log n (random insertion).

Node Count

n nodes, height h: n >= 2^h - 1 (full binary tree). Reverse: h >= log n (lower bound on height).

Insertion Operation

Algorithm

insert(node, key): if node == null: create new node if key < node.value: node.left = insert(node.left, key) else if key > node.value: node.right = insert(node.right, key) else: duplicate (no insert) return node

Example

Insert 12 into: 15 / \ 10 20Compare 12 with 15: 12 < 15, go leftCompare 12 with 10: 12 > 10, go rightInsert 12 as right child of 10 15 / \ 10 20 \ 12

Complexity

Best: O(1) (insert at leaf). Average: O(log n). Worst: O(n) (degenerate).

Deletion Operation

Three Cases

Case 1: Leaf node - delete directly. Case 2: One child - replace with child. Case 3: Two children - find successor/predecessor, replace, delete successor.

Example 1: Leaf

Delete 12 from: 15 / \ 10 20 \ 12Result: 15 / \ 10 20

Example 2: One Child

Delete 10 from: 15 / \ 10 20 \ 12Result: 15 / \ 12 20

Example 3: Two Children

Delete 15 from: 15 / \ 10 20 \ 12Find successor (smallest > 15): 20Replace 15 with 20, delete 20 20 / \ 10 (nothing) \ 12

Complexity

Best: O(1) (leaf, no balancing). Average: O(log n). Worst: O(n) (degenerate).

Successor and Predecessor

Successor

Next larger value. If right child exists: leftmost node in right subtree. Otherwise: ancestor (parent that node is left child of).

Predecessor

Next smaller value. If left child exists: rightmost node in left subtree. Otherwise: ancestor (parent that node is right child of).

Algorithm

successor(node): if node.right != null: return leftmost(node.right) else: return parent (going up tree)

Use in Deletion

Delete with two children: find successor, copy successor value to node, delete successor (has 0 or 1 child).

Tree Traversals

In-Order (L, Root, R)

produces sorted sequenceExample: 5, 10, 12, 15, 17, 20, 25

Pre-Order (Root, L, R)

root first: useful for copying treeExample: 15, 10, 5, 12, 20, 17, 25

Post-Order (L, R, Root)

root last: useful for deletionExample: 5, 12, 10, 17, 25, 20, 15

Level-Order (BFS)

breadth-first: level by levelExample: 15, 10, 20, 5, 12, 17, 25

Complexity

All traversals: O(n) (visit every node once).

Time Complexity

Summary Table

OperationBest CaseAverage CaseWorst Case
SearchO(1)O(log n)O(n)
InsertO(log n)O(log n)O(n)
DeleteO(log n)O(log n)O(n)
SpaceO(n)O(n)O(n)

Worst-Case Scenarios

Degenerate Tree

Insert: 1, 2, 3, 4, 5 (sorted sequence)Result: linked-list-like tree 1 \ 2 \ 3 \ 4 \ 5Height: 5 (n), not log(n)

Avoiding Degeneration

Random insertion: statistically balanced. Balanced variants: guarantee log n height (AVL, Red-Black). Avoid sorted input: can't assume random.

Why Worst Case Matters

Adversary: can deliberately insert sorted input. System critical: cannot accept O(n) worst case. Solution: use self-balancing trees.

Applications

Sorted Data Structures

Maintain sorted order: insertion O(log n) average. Retrieve in order: in-order traversal O(n).

File Systems

Directory structures: efficient file lookup. File allocation tables: tree-based organization.

Expression Trees

Parse expressions: operations as nodes. Evaluation: post-order traversal. Compilation: abstract syntax trees.

Database Indexes

B-trees (multi-way BSTs): practical database use. Billions of rows: tree structure enables fast lookup.

Self-Balancing Variants

AVL Trees

Height-balanced: height difference <= 1. Guarantee: O(log n) worst case. Rotation: rebalancing operations.

Red-Black Trees

Color-based: red/black node coloring. Guarantee: height <= 2 * log(n). Fewer rotations than AVL.

B-Trees

Multi-way: many children per node. Disk-optimized: minimize I/O. Database standard.

When to Use Basic BST

Theoretical: teaching data structures. Small datasets: n < 100. Random data: no adversarial input. Otherwise: use balanced variant.

References

  • Knuth, D. E. "The Art of Computer Programming: Sorting and Searching." Vol. 3, Addison-Wesley, 2nd edition, 1998.
  • Cormen, T. H., Leiserson, C. E., Rivest, R. L., and Stein, C. "Introduction to Algorithms." MIT Press, 3rd edition, 2009.
  • Sedgewick, R., and Wayne, K. "Algorithms." Addison-Wesley, 4th edition, 2011.
  • Weiss, M. A. "Data Structures and Problem Solving Using Java." Addison-Wesley, 4th edition, 2010.
  • Goodrich, M. T., Tamassia, R., and Goldwasser, M. H. "Data Structures and Algorithms in Java." Wiley, 6th edition, 2014.