Introduction
Binary search is one of the most fundamental and efficient searching algorithms in computer science. It operates on sorted arrays by repeatedly dividing the search interval in half. At each step, the algorithm compares the target value with the middle element of the current interval. If the target matches the middle element, the search is complete. If the target is less than the middle element, the search continues in the lower half; if greater, it continues in the upper half. This halving process continues until the target is found or the interval is empty.
The key insight behind binary search is that each comparison eliminates half of the remaining candidates, resulting in a logarithmic time complexity of O(log n). This makes binary search dramatically faster than linear search for large datasets. For example, searching through one million sorted elements requires at most 20 comparisons with binary search, compared to up to one million comparisons with linear search.
"Although the basic idea of binary search is comparatively straightforward, the details can be surprisingly tricky, and many good programmers have done it wrong the first few times they tried." -- Jon Bentley, Programming Pearls
Binary search was first described by John Mauchly in 1946, but the first bug-free published implementation did not appear until 1962. Even in 2006, a subtle integer overflow bug was discovered in widely used implementations including the Java standard library. These facts underscore the importance of understanding the algorithm thoroughly despite its apparent simplicity.
How It Works
Prerequisites
Binary search requires the input array to be sorted in non-decreasing order. If the array is not sorted, the algorithm will produce incorrect results. The data structure must also support random access (constant-time indexing), which is why binary search works on arrays but not directly on linked lists.
Step-by-Step Procedure
- Set two pointers: low = 0 and high = n - 1, where n is the array length.
- While low is less than or equal to high:
- Compute the middle index: mid = low + (high - low) / 2.
- If A[mid] equals the target, return mid.
- If A[mid] is less than the target, set low = mid + 1 (search the right half).
- If A[mid] is greater than the target, set high = mid - 1 (search the left half).
- If the loop ends without finding the target, return -1 (not found).
Why mid = low + (high - low) / 2?
The naive formula mid = (low + high) / 2 can cause integer overflow when low and high are large. The expression low + (high - low) / 2 produces the same result without overflow risk. This subtle bug went undetected in many production implementations for decades.
Algorithm and Pseudocode
Iterative Binary Search
procedure BinarySearch(A, n, target) Input: A = sorted array of n elements, target = value to find Output: index of target in A, or -1 if not found low := 0 high := n - 1 while low <= high do mid := low + (high - low) / 2 if A[mid] = target then return mid else if A[mid] < target then low := mid + 1 else high := mid - 1 end if end while return -1end procedureRecursive Binary Search
procedure BinarySearchRecursive(A, low, high, target) Input: A = sorted array, low and high = current search bounds, target = value to find Output: index of target in A, or -1 if not found if low > high then return -1 end if mid := low + (high - low) / 2 if A[mid] = target then return mid else if A[mid] < target then return BinarySearchRecursive(A, mid + 1, high, target) else return BinarySearchRecursive(A, low, mid - 1, target) end ifend procedureIterative vs. Recursive
Both approaches have identical time complexity of O(log n). The iterative version uses O(1) auxiliary space, while the recursive version uses O(log n) stack space due to recursive calls. In practice, the iterative version is generally preferred for its lower space overhead and avoidance of potential stack overflow on very large arrays.
Complexity Analysis
Time Complexity
| Case | Complexity | Explanation |
|---|---|---|
| Best Case | O(1) | Target is at the middle of the array on the first comparison. |
| Average Case | O(log n) | Each comparison halves the search space; approximately log2(n) comparisons needed. |
| Worst Case | O(log n) | Target is not present or at the boundary; the search space is halved until empty. |
Space Complexity
| Implementation | Space Complexity | Reason |
|---|---|---|
| Iterative | O(1) | Uses only a fixed number of variables (low, high, mid). |
| Recursive | O(log n) | Each recursive call adds a frame to the call stack; maximum depth is log2(n). |
Number of Comparisons
The maximum number of comparisons binary search performs is floor(log2(n)) + 1. For practical reference:
- n = 1,000: at most 10 comparisons
- n = 1,000,000: at most 20 comparisons
- n = 1,000,000,000: at most 30 comparisons
This logarithmic growth is what makes binary search so powerful for large datasets.
Worked Example
Search for target = 23 in the sorted array A = [2, 5, 8, 12, 16, 23, 38, 56, 72, 91].
Step-by-Step Trace
| Step | low | high | mid | A[mid] | Comparison | Action |
|---|---|---|---|---|---|---|
| 1 | 0 | 9 |