Introduction
Insertion sort is a simple and intuitive comparison-based sorting algorithm that builds a sorted portion of the array one element at a time. It works similarly to the way most people sort a hand of playing cards: you pick up one card at a time and insert it into its correct position among the cards you have already sorted.
The algorithm divides the input array into two conceptual regions: a sorted region on the left and an unsorted region on the right. Initially, the sorted region contains only the first element. The algorithm then repeatedly takes the next element from the unsorted region and inserts it into its correct position within the sorted region by shifting larger elements to the right.
Despite its O(n^2) worst-case time complexity, insertion sort is highly efficient for small datasets, nearly sorted data, and as the base case in more advanced hybrid sorting algorithms. It is one of the most practical quadratic-time sorting algorithms and is preferred over bubble sort and selection sort in most real-world scenarios.
"Insertion sort is a simple sorting algorithm that is appropriate for small inputs. Most people can readily understand insertion sort by analogy to sorting a hand of playing cards." -- Thomas H. Cormen, Introduction to Algorithms
How It Works
Basic Approach
Insertion sort processes the array from left to right. For each element, it compares the element with those in the sorted portion and shifts elements to the right to make room, then places the element in its correct position.
- Start with the second element (index 1). The first element is considered already sorted.
- Store the current element in a temporary variable called the "key."
- Compare the key with each element in the sorted portion, moving from right to left.
- Shift elements that are greater than the key one position to the right.
- Insert the key into the position where all elements to its left are smaller or equal.
- Repeat for all remaining elements in the unsorted portion.
Visual Intuition
Think of a row of books on a shelf that you want to arrange by height. You start from the second book and compare it to the first. If it is shorter, you slide the first book to the right and place the shorter one in its spot. Then you take the third book and slide it into the correct position among the first two, and so on. Each time you place a book, all the books to its left are already in order.
Algorithm and Pseudocode
```
procedure insertionSort(A : array of sortable items)
    n := length(A)
    for i := 1 to n - 1 do
        key := A[i]
        j := i - 1
        // Shift elements of A[0..i-1] that are greater than key
        while j >= 0 and A[j] > key do
            A[j + 1] := A[j]
            j := j - 1
        end while
        A[j + 1] := key
    end for
end procedure
```

The outer loop runs from the second element to the last. The inner while loop shifts elements rightward until the correct insertion point for the key is found. Because it uses > rather than >=, equal elements preserve their original relative order, making the algorithm stable.
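The pseudocode translates almost line for line into Python; a minimal sketch:

```python
def insertion_sort(a):
    """Sort the list a in place and return it."""
    for i in range(1, len(a)):
        key = a[i]          # element to insert into the sorted prefix a[0..i-1]
        j = i - 1
        # Shift elements greater than key one position to the right.
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key      # insert key at its final position for this pass
    return a

print(insertion_sort([7, 3, 5, 1, 9, 2]))  # [1, 2, 3, 5, 7, 9]
```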
Time and Space Complexity
| Case | Time Complexity | Comparisons | Shifts |
|---|---|---|---|
| Best Case (already sorted) | O(n) | n - 1 | 0 |
| Average Case (random order) | O(n^2) | ~n^2 / 4 | ~n^2 / 4 |
| Worst Case (reverse sorted) | O(n^2) | n(n - 1) / 2 | n(n - 1) / 2 |
Space Complexity: O(1). Insertion sort is an in-place algorithm. It only requires a constant amount of additional memory for the key variable and loop counters, regardless of the input size.
Stability: Insertion sort is stable. Elements with equal keys remain in their original relative order because the shifting condition uses strict inequality.
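Stability can be checked directly by sorting records that compare on only one field; a small sketch (the tuple records and key function here are illustrative):

```python
def insertion_sort_by_key(a, key=lambda x: x):
    """Insertion sort comparing key(x); strict > keeps equal keys in input order."""
    for i in range(1, len(a)):
        item = a[i]
        j = i - 1
        while j >= 0 and key(a[j]) > key(item):
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = item
    return a

records = [(2, 'a'), (1, 'b'), (2, 'c'), (1, 'd')]
print(insertion_sort_by_key(records, key=lambda r: r[0]))
# [(1, 'b'), (1, 'd'), (2, 'a'), (2, 'c')] -- equal keys keep their input order
```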
Adaptivity: Insertion sort is adaptive. Its running time depends on the number of inversions in the input. An inversion is a pair (i, j) where i < j but A[i] > A[j]. If the number of inversions is k, the algorithm runs in O(n + k) time, making it very efficient for nearly sorted data.
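The relationship between inversions and running time can be made concrete: each shift in the inner loop removes exactly one inversion. A sketch that counts shifts:

```python
def insertion_sort_counting(a):
    """Sort a in place; return the number of shifts (one per inversion removed)."""
    shifts = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
            shifts += 1
        a[j + 1] = key
    return shifts

print(insertion_sort_counting([1, 2, 4, 3, 5, 6]))  # 1  -- single inversion (4, 3)
print(insertion_sort_counting([6, 5, 4, 3, 2, 1]))  # 15 -- n(n-1)/2 for n = 6
```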
Worked Example
Sort the array [7, 3, 5, 1, 9, 2] in ascending order using insertion sort.
| Pass | Key | Action | Array State |
|---|---|---|---|
| Initial | -- | Starting array | [7, 3, 5, 1, 9, 2] |
| 1 | 3 | 3 < 7, shift 7 right, insert 3 | [3, 7, 5, 1, 9, 2] |
| 2 | 5 | 5 < 7, shift 7 right; 5 > 3, insert 5 | [3, 5, 7, 1, 9, 2] |
| 3 | 1 | 1 < 7, shift 7; 1 < 5, shift 5; 1 < 3, shift 3; insert 1 | [1, 3, 5, 7, 9, 2] |
| 4 | 9 | 9 > 7, no shifts needed, insert 9 | [1, 3, 5, 7, 9, 2] |
| 5 | 2 | 2 < 9, shift 9; 2 < 7, shift 7; 2 < 5, shift 5; 2 < 3, shift 3; 2 > 1, insert 2 | [1, 2, 3, 5, 7, 9] |
After each pass, the sorted region extends one element further to the right. Notice that pass 4 required zero shifts because 9 was already in the correct position relative to the sorted portion. This illustrates the adaptive nature of insertion sort.
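The pass-by-pass table can be reproduced by recording the array state after each insertion; a sketch:

```python
def insertion_sort_trace(a):
    """Sort a in place, returning the array state after each pass."""
    states = []
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
        states.append(list(a))  # snapshot after this pass
    return states

for state in insertion_sort_trace([7, 3, 5, 1, 9, 2]):
    print(state)
# [3, 7, 5, 1, 9, 2]
# [3, 5, 7, 1, 9, 2]
# [1, 3, 5, 7, 9, 2]
# [1, 3, 5, 7, 9, 2]
# [1, 2, 3, 5, 7, 9]
```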
Advantages and Disadvantages
Advantages
- Simple to implement: The algorithm is straightforward and easy to code correctly.
- Efficient for small datasets: The low overhead makes it competitive with, and often faster than, O(n log n) algorithms on small arrays (roughly 10-50 elements).
- Adaptive: Runs in nearly O(n) time on data that is already or nearly sorted.
- Stable: Preserves the relative order of equal elements.
- In-place: Requires only O(1) additional memory.
- Online: Can sort data as it is received, without needing the entire input upfront.
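The online property can be sketched with Python's standard bisect module, which performs the same insert-into-sorted-prefix step each time an item arrives:

```python
import bisect

sorted_so_far = []
for x in [7, 3, 5, 1, 9, 2]:           # items arriving one at a time
    bisect.insort(sorted_so_far, x)    # binary-search insert keeps the list sorted
    print(sorted_so_far)               # a valid sorted prefix after every arrival
```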
Disadvantages
- Quadratic worst case: O(n^2) time complexity makes it unsuitable for large, randomly ordered datasets.
- Excessive shifting: Inserting an element near the beginning of a large sorted portion requires shifting many elements.
- Not suitable for large datasets: Algorithms like merge sort and quicksort significantly outperform insertion sort as the input size grows.
Comparison with Other Algorithms
| Algorithm | Best | Average | Worst | Space | Stable |
|---|---|---|---|---|---|
| Insertion Sort | O(n) | O(n^2) | O(n^2) | O(1) | Yes |
| Bubble Sort | O(n) | O(n^2) | O(n^2) | O(1) | Yes |
| Selection Sort | O(n^2) | O(n^2) | O(n^2) | O(1) | No |
| Merge Sort | O(n log n) | O(n log n) | O(n log n) | O(n) | Yes |
| Quick Sort | O(n log n) | O(n log n) | O(n^2) | O(log n) | No |
| Heap Sort | O(n log n) | O(n log n) | O(n log n) | O(1) | No |
Insertion Sort vs. Bubble Sort: Both are O(n^2) and stable, but insertion sort performs roughly half as many comparisons and swaps on average. Insertion sort shifts elements rather than repeatedly swapping adjacent pairs, which is more efficient in practice.
Insertion Sort vs. Selection Sort: Selection sort always performs O(n^2) comparisons regardless of input order and is not stable. Insertion sort adapts to nearly sorted input and is stable, making it the preferred quadratic-time algorithm in most cases.
Insertion Sort vs. Merge Sort: Merge sort guarantees O(n log n) time but requires O(n) extra space. For small arrays, insertion sort's lower overhead often makes it faster, which is why many merge sort implementations switch to insertion sort for subarrays below a threshold (typically 10-50 elements).
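A hybrid of this kind can be sketched as follows (the threshold of 16 is an illustrative choice, not a recommendation; production libraries tune it empirically):

```python
THRESHOLD = 16  # illustrative cutoff for switching to insertion sort

def insertion_sort_range(a, lo, hi):
    """Sort the slice a[lo:hi] in place with insertion sort."""
    for i in range(lo + 1, hi):
        key = a[i]
        j = i - 1
        while j >= lo and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key

def merge(left, right):
    """Stable merge of two sorted lists."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]

def hybrid_merge_sort(a, lo=0, hi=None):
    """Merge sort that falls back to insertion sort for small subarrays."""
    if hi is None:
        hi = len(a)
    if hi - lo <= THRESHOLD:
        insertion_sort_range(a, lo, hi)   # low-overhead base case
        return
    mid = (lo + hi) // 2
    hybrid_merge_sort(a, lo, mid)
    hybrid_merge_sort(a, mid, hi)
    a[lo:hi] = merge(a[lo:mid], a[mid:hi])
```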
Implementation Notes
Binary Insertion Sort
A variant that uses binary search to find the correct insertion point, reducing the number of comparisons to O(n log n) in total. However, the number of shifts remains O(n^2) in the worst case, so the overall time complexity does not improve. It can be beneficial when comparisons are expensive relative to data movement.
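A Python sketch of this variant, using the standard library's bisect_right for the search step (bisect_right preserves stability, since equal elements are inserted after existing ones):

```python
import bisect

def binary_insertion_sort(a):
    """Insertion sort with a binary search for the insertion point."""
    for i in range(1, len(a)):
        key = a[i]
        # O(log i) comparisons to locate the insertion point in a[0..i-1].
        pos = bisect.bisect_right(a, key, 0, i)
        # The shift is still O(i) in the worst case.
        a[pos + 1:i + 1] = a[pos:i]
        a[pos] = key
    return a
```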
```
procedure binaryInsertionSort(A : array)
    for i := 1 to length(A) - 1 do
        key := A[i]
        // Use binary search to find insertion point
        pos := binarySearch(A, 0, i - 1, key)
        // Shift all elements from pos to i-1 one position right
        for j := i - 1 down to pos do
            A[j + 1] := A[j]
        end for
        A[pos] := key
    end for
end procedure
```

Use in Hybrid Algorithms
Insertion sort is commonly used as the base case in divide-and-conquer sorting algorithms. For example, Python's Timsort and Java's dual-pivot quicksort both fall back to insertion sort for small subarrays. The crossover point is typically determined empirically and is usually between 10 and 64 elements.
Linked List Variant
Insertion sort is particularly well suited for sorting linked lists, because insertion into a linked list is O(1) once the correct position is found. This eliminates the costly shifting step, though finding the position still requires traversal.
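A minimal singly linked list sketch (the Node class and the from_list/to_list helpers are illustrative scaffolding, not part of any standard API):

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def insertion_sort_list(head):
    """Relink the nodes of a singly linked list into sorted order; return new head."""
    sorted_head = None                    # head of the growing sorted list
    while head is not None:
        node, head = head, head.next      # detach the next unsorted node
        if sorted_head is None or node.value < sorted_head.value:
            node.next, sorted_head = sorted_head, node
        else:
            cur = sorted_head
            # Traversal to find the position is the O(n) part.
            while cur.next is not None and cur.next.value <= node.value:
                cur = cur.next
            node.next, cur.next = cur.next, node   # O(1) splice, no shifting
    return sorted_head

def from_list(values):
    head = None
    for v in reversed(values):
        head = Node(v, head)
    return head

def to_list(head):
    out = []
    while head is not None:
        out.append(head.value)
        head = head.next
    return out

print(to_list(insertion_sort_list(from_list([7, 3, 5, 1, 9, 2]))))  # [1, 2, 3, 5, 7, 9]
```

The `<=` in the traversal inserts equal elements after existing ones, keeping the variant stable.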
Applications
- Small datasets: Frequently used in practice for sorting arrays with fewer than 50 elements, where its low overhead outweighs its quadratic complexity.
- Nearly sorted data: Ideal for maintaining a sorted order when new data arrives incrementally, such as in real-time data feeds or streaming applications.
- Hybrid sorting algorithms: Used as the base case in Timsort (Python, Java), introsort (C++ STL), and other production-quality sorting implementations.
- Online sorting: Because it processes elements one at a time from left to right, it can sort data as it arrives without needing the entire input in advance.
- Embedded systems: Its minimal memory footprint and simple control flow make it suitable for resource-constrained environments.
References
- Cormen, T. H., Leiserson, C. E., Rivest, R. L., and Stein, C. Introduction to Algorithms, 4th Edition. MIT Press, 2022. Chapter 2: Getting Started.
- Knuth, D. E. The Art of Computer Programming, Volume 3: Sorting and Searching, 2nd Edition. Addison-Wesley, 1998. Section 5.2.1: Sorting by Insertion.
- Sedgewick, R. and Wayne, K. Algorithms, 4th Edition. Addison-Wesley, 2011. Section 2.1: Elementary Sorts.
- Bentley, J. L. "Programming Pearls: A Literate Program." Communications of the ACM, 29(6), 1986.