Introduction
AVL tree: self-balancing binary search tree (named after Adelson-Velsky and Landis). Property: height difference between left/right subtrees <= 1 (balanced). Guarantee: O(log n) operations (search, insert, delete). Trade-off: rotation overhead vs guaranteed performance.
"AVL trees prevent the worst-case degeneration of binary search trees into linked lists. Self-balancing ensures logarithmic performance even in adversarial input scenarios, at the cost of rotation overhead." -- Data structures
AVL Tree Definition
AVL Property
Binary search tree: standard BST ordering (left < parent < right). Balance condition: |height(left) - height(right)| <= 1 for all nodes. Recursive: applies to all subtrees.
Height Definition
Height: longest path to leaf (edges). Empty tree: height -1. Single node: height 0. Height increases by 1 per level.
Example Structure
20 (height 2, balance 0) / \ 10 30 (heights 1, balance 0) / \ \ 5 15 40 (leaf heights 0)Balance factors: 20(+0), 10(+0), 30(-1), 5(0), 15(0), 40(0)All within [-1, 1]: AVL property satisfiedNon-AVL Example
1 (height 3) \ 2 (height 2) \ 3 (height 1) \ 4 (height 0)Balance factors: 1(-2): VIOLATED (|−2| > 1)This is NOT an AVL tree (degenerated to linked list)Balance Factor
Definition
Balance factor (BF): height(left) - height(right). Valid range: [-1, 0, +1]. Out of range: triggers rebalancing.
Interpretation
BF = +1: left subtree is higher (left-heavy)BF = 0: balancedBF = -1: right subtree is higher (right-heavy)BF > +1: severely left-heavy (needs rotation)BF < -1: severely right-heavy (needs rotation)Tracking During Operations
Insertion: update BF from inserted node upward. Propagate: check each ancestor's BF. Rebalance: if BF < -1 or BF > +1.
Example Calculation
Node with left subtree (height 2) and right subtree (height 1):BF = 2 - 1 = +1 (valid)Node with left subtree (height 3) and right subtree (height 1):BF = 3 - 1 = +2 (invalid, needs rotation)Rotation Operations
Right Rotation
Before: C After: B / \ / \ B D A C / \ / \ A E E DPurpose: fixes left-heavy imbalanceEffect: reduces left subtree height, increases rightLeft Rotation
Before: A After: B \ / \ B A C / \ / \ E C E DPurpose: fixes right-heavy imbalanceEffect: reduces right subtree height, increases leftLeft-Right Rotation
Left on left child, then right on root: C C B / / / \ A → B → A C \ / B APurpose: fixes zig-zag (left child right-heavy)Right-Left Rotation
Right on right child, then left on root: A A B \ \ / \ C → B → A C / \ B CPurpose: fixes zig-zag (right child left-heavy)Rotation Cost
Time: O(1) per rotation (pointer updates). Space: O(1) (in-place). Cascading: up to O(log n) rotations per operation.
Insertion with Rebalancing
Algorithm
1. Insert node (standard BST insertion)2. Update heights from inserted node upward3. Check balance factors moving up tree4. If BF < -1 or BF > +1: - Identify imbalance type (LL, LR, RR, RL) - Apply appropriate rotation(s)5. Continue up to rootCase 1: Left-Left (LL)
Imbalance: new node inserted in left subtree of left child C (BF=2) B (BF=0) / / \ B (BF=1) → A C /A (new)Solution: Right rotation on CCase 2: Right-Right (RR)
Imbalance: new node inserted in right subtree of right child A (BF=-2) B (BF=0) \ / \ B (BF=-1) → A C \ C (new)Solution: Left rotation on ACase 3: Left-Right (LR)
Imbalance: new node in right subtree of left child C (BF=2) C (BF=2) B (BF=0) / / / \ A (BF=-1) → B → A C \ / B (new) ASolution: Left rotation on A, then right rotation on CCase 4: Right-Left (RL)
Imbalance: new node in left subtree of right child A (BF=-2) A (BF=-2) B (BF=0) \ \ / \ C (BF=1) → B → A C / \ B (new) CSolution: Right rotation on C, then left rotation on ATime Complexity
Find position: O(log n). Insert: O(1). Rebalancing: O(log n) rotations (worst case). Total: O(log n).
Deletion with Rebalancing
Algorithm
1. Delete node (standard BST deletion) - Leaf: just remove - One child: replace with child - Two children: find successor, replace, delete successor2. Update heights from parent of deleted upward3. Check balance factors, rebalance as needed4. Continue until rootComplexity and Cascading
Find: O(log n). Delete node: O(1). Rebalancing: up to O(log n) rotations (worse than insertion, single rebalance insufficient). Total: O(log n).
Example Deletion
30 / \ 20 40 (delete 20) / \ 10 25After deletion: 30 / \ 25 40 (now BF(30) = -1, still valid)Different case (deletion causes imbalance): 20 / \ 10 30 (delete 10) / 5After deletion: 20 \ 30 (BF(20) = -2, needs rebalancing)Search Operations
Search Algorithm
Standard BST search (no changes for AVL):1. Start at root2. If target == current: found3. If target < current: search left4. If target > current: search right5. If reach null: not foundTime: O(log n) (guaranteed by balance)Guaranteed Logarithmic Time
AVL property ensures height = O(log n). Search follows one path (height-long). Cost: O(log n) comparisons.
Finding Min/Max
Min: always go left. Max: always go right. Cost: O(log n) (height-limited).
Range Queries
Find all nodes between two values. Cost: O(log n + k) where k = number of results (optimal).
Height Properties
Maximum Height Theorem
For AVL tree with n nodes: height <= 1.44 * log2(n + 1). Proof: minimum nodes for height h grows exponentially (Fibonacci-like).
Minimum Nodes Formula
N(h): minimum nodes for height h. N(0)=1, N(1)=2, N(h)=N(h-1)+N(h-2)+1. Similar to Fibonacci (about 1.618^h growth).
Example
Height 0: min 1 node (just root)Height 1: min 2 nodes (root + left or right)Height 2: min 4 nodes (1+2+1)Height 3: min 7 nodes (1+4+2)Height 4: min 12 nodes (1+7+4)For 1 million nodes: max height ~20Comparison to Other Trees
Unbalanced BST: height could be n (worst case). AVL: height always log n (factor of n better). Red-Black: height 2 * log n (slightly worse than AVL, faster rotation).
Time Complexity Analysis
Operation Summary
| Operation | Best Case | Average Case | Worst Case |
|---|---|---|---|
| Search | O(1) | O(log n) | O(log n) |
| Insert | O(log n) | O(log n) | O(log n) |
| Delete | O(log n) | O(log n) | O(log n) |
| Space | O(n) | O(n) | O(n) |
Why Guaranteed Logarithmic
Balanced: height always O(log n). Every operation follows one path (height-long). Rebalancing cost: O(log n) rotations (still logarithmic total).
Rotation Overhead
Worst case: O(log n) rotations per insertion/deletion. Each rotation: O(1). Total: O(log n) per operation (acceptable trade-off).
Detailed Rotation Cases
Single Right Rotation (LL Case)
Before: z(+2) After: y / \ / \ y T4 x z / \ → / \ / \ x T3 T1 T2 T3 T4 / \ T1 T2Code: z.left = y.right y.right = z update heights return ySingle Left Rotation (RR Case)
Before: x(-2) After: y / \ / \ T1 y x z / \ → / \ / \ T2 z T1 T2 T3 T4 / \ T3 T4Code: x.right = y.left y.left = x update heights return yLeft-Right Rotation (LR Case)
Two rotations: left rotation on left child, then right rotation on root. Handles zig-zag configuration (left-child right-heavy).
Right-Left Rotation (RL Case)
Two rotations: right rotation on right child, then left rotation on root. Handles zig-zag configuration (right-child left-heavy).
Implementation Patterns
Node Structure
class AVLNode: value: data left: AVLNode (or null) right: AVLNode (or null) height: int (for balance calculations)Essential: height field for efficient BF calculationHelper Functions
height(node): return node.height or 0 if nullbalance_factor(node): height(left) - height(right)update_height(node): node.height = 1 + max(height(left), height(right))rotate_right(z): perform right rotation update heights return new rootrotate_left(z): perform left rotation update heights return new rootInsert Pattern
insert(node, value): if node is null: create new node if value < node.value: node.left = insert(node.left, value) else if value > node.value: node.right = insert(node.right, value) else: return (duplicate, no insert) update_height(node) bf = balance_factor(node) if bf > 1: return left-rotate or LR-rotate if bf < -1: return right-rotate or RL-rotate return nodeComparison with Other Trees
AVL vs Red-Black Trees
| Property | AVL | Red-Black |
|---|---|---|
| Balance | Stricter (height diff ≤ 1) | Looser (color-based) |
| Height | O(log n) (tighter) | O(log n) (looser) |
| Search | Faster (fewer comparisons) | Same (O log n) |
| Insert/Delete | More rotations | Fewer rotations |
| Use | Read-heavy | Write-heavy |
AVL vs BST
Unbalanced BST: O(n) worst case (linear). AVL: O(log n) guaranteed. Cost: rotation overhead (negligible compared to O(n)).
When to Use AVL
Search-heavy workloads. Predictable performance needed. Sorted iteration common. Examples: file systems, databases indexes.
References
- Adelson-Velsky, G., and Landis, E. "An Algorithm for the Organization of Information." Proceedings of the USSR Academy of Sciences, vol. 146, 1962, pp. 263-266.
- Knuth, D. E. "The Art of Computer Programming: Sorting and Searching." Vol. 3, Addison-Wesley, 2nd edition, 1998.
- Cormen, T. H., Leiserson, C. E., Rivest, R. L., and Stein, C. "Introduction to Algorithms." MIT Press, 3rd edition, 2009.
- Sedgewick, R., and Wayne, K. "Algorithms." Addison-Wesley, 4th edition, 2011.
- Weiss, M. A. "Data Structures and Problem Solving Using Java." Addison-Wesley, 4th edition, 2010.