1. Introduction
When multiplying a chain of matrices, the order in which we perform the multiplications can dramatically affect the total number of scalar operations. Matrix multiplication is associative -- the result is the same regardless of how we parenthesize the product -- but the computational cost can vary by orders of magnitude depending on the grouping.
"The matrix-chain multiplication problem is perhaps the most well-known example of a dynamic programming problem that is not immediately obvious how to decompose into subproblems." -- Thomas H. Cormen, Introduction to Algorithms
The Matrix Chain Multiplication (MCM) problem asks: given a sequence of matrices, find the parenthesization that minimizes the total number of scalar multiplications.
2. Problem Definition
Input: A sequence of n matrices A1, A2, ..., An where matrix Ai has dimensions p[i-1] x p[i]. The dimensions are given as an array p[0..n].
Output: The minimum number of scalar multiplications needed to compute the product A1 * A2 * ... * An, along with the optimal parenthesization.
Matrix multiplication cost: Multiplying a p x q matrix by a q x r matrix requires p * q * r scalar multiplications and produces a p x r matrix.
| Matrix | Dimensions |
|---|---|
| A1 | 10 x 30 |
| A2 | 30 x 5 |
| A3 | 5 x 60 |
Dimensions array: p = [10, 30, 5, 60]
3. Why Parenthesization Matters
Consider the three matrices above. There are two possible parenthesizations:
| Parenthesization | Computation | Scalar Multiplications |
|---|---|---|
| (A1 * A2) * A3 | First: 10*30*5 = 1500; Then: 10*5*60 = 3000 | 4,500 |
| A1 * (A2 * A3) | First: 30*5*60 = 9000; Then: 10*30*60 = 18000 | 27,000 |
A 6x difference! The first parenthesization requires only 4,500 scalar multiplications while the second requires 27,000. For longer chains, the gap between the best and worst orderings can be far larger still.
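The arithmetic above can be checked directly in Python (a quick sketch; the variable names are ours):

```python
# Dimensions array p = [10, 30, 5, 60]: A1 is 10x30, A2 is 30x5, A3 is 5x60.

# (A1 * A2) * A3: A1*A2 costs 10*30*5 = 1500 and yields a 10x5 matrix,
# which is then multiplied by the 5x60 matrix A3 for another 10*5*60 = 3000.
left_first = 10 * 30 * 5 + 10 * 5 * 60

# A1 * (A2 * A3): A2*A3 costs 30*5*60 = 9000 and yields a 30x60 matrix,
# which the 10x30 matrix A1 then multiplies for another 10*30*60 = 18000.
right_first = 30 * 5 * 60 + 10 * 30 * 60

print(left_first, right_first)    # 4500 27000
print(right_first // left_first)  # 6
```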
4. Brute Force Approach
The brute force approach tries all possible parenthesizations. The number of ways to parenthesize a product of n matrices is the (n-1)th Catalan number, which grows as O(4^n / n^(3/2)).
```
function mcmBrute(p, i, j):
    if i == j:
        return 0                        // single matrix, no multiplication needed
    minCost = infinity
    for k = i to j-1:
        cost = mcmBrute(p, i, k)        // cost of left subchain
             + mcmBrute(p, k+1, j)      // cost of right subchain
             + p[i-1] * p[k] * p[j]     // cost of the final multiplication
        minCost = min(minCost, cost)
    return minCost
```

For the chain Ai..Aj, we try every split position k (where i <= k < j), recursively solve the left and right subchains, and add the cost of multiplying the two resulting matrices.
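A direct Python translation of this pseudocode (the function name `mcm_brute` is ours; exponential time, suitable only for small n):

```python
import math

def mcm_brute(p, i, j):
    """Minimum scalar multiplications for the chain A_i..A_j.

    p is the dimensions array: matrix A_k has dimensions p[k-1] x p[k].
    """
    if i == j:
        return 0  # a single matrix needs no multiplication
    best = math.inf
    for k in range(i, j):                   # try every split point i <= k < j
        cost = (mcm_brute(p, i, k)          # left subchain
                + mcm_brute(p, k + 1, j)    # right subchain
                + p[i - 1] * p[k] * p[j])   # final multiplication
        best = min(best, cost)
    return best

print(mcm_brute([10, 30, 5, 60], 1, 3))  # 4500
```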
5. Optimal Substructure
If the optimal solution splits the chain Ai..Aj at position k, then:
- The left subchain Ai..Ak must be parenthesized optimally.
- The right subchain Ak+1..Aj must be parenthesized optimally.
Any suboptimal parenthesization of either subchain would increase the total cost, contradicting the assumption of overall optimality. This is the optimal substructure property.
Overlapping subproblems arise because different splits at higher levels lead to the same subchain. For example, when computing the optimal cost for A1..A5, the subproblem A2..A4 is needed both when A1..A4 is split at k=1 and when A2..A5 is split at k=4.
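The overlap is easy to observe by instrumenting the plain recursion to count how often each (i, j) subproblem is visited (a small sketch; the counter and function name are ours):

```python
import math
from collections import Counter

calls = Counter()

def mcm_count(p, i, j):
    """Brute-force recursion that records every visit to subproblem (i, j)."""
    calls[(i, j)] += 1
    if i == j:
        return 0
    best = math.inf
    for k in range(i, j):
        best = min(best, mcm_count(p, i, k) + mcm_count(p, k + 1, j)
                   + p[i - 1] * p[k] * p[j])
    return best

mcm_count([30, 35, 15, 5, 10, 20], 1, 5)  # a 5-matrix chain
# Inner subchains such as A2..A4 are solved more than once.
print(calls[(2, 4)])  # 2
```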
6. Recurrence Relation
Let m[i][j] represent the minimum number of scalar multiplications needed to compute Ai * Ai+1 * ... * Aj.
Recurrence:
m[i][j] = min over i <= k < j of { m[i][k] + m[k+1][j] + p[i-1] * p[k] * p[j] }
Base case: m[i][i] = 0 (a single matrix requires no multiplication)
We also maintain a table s[i][j] that records the split point k that achieved the minimum, which we use later to reconstruct the optimal parenthesization.
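The recurrence can also be evaluated top-down with memoization; here is a sketch in Python using `functools.lru_cache` (the function names are ours):

```python
import math
from functools import lru_cache

def mcm_memo(p):
    """Top-down evaluation of m[i][j]; matrix A_k has dimensions p[k-1] x p[k]."""
    n = len(p) - 1

    @lru_cache(maxsize=None)
    def m(i, j):
        if i == j:
            return 0  # base case: a single matrix
        # min over all splits i <= k < j, exactly as in the recurrence
        return min(m(i, k) + m(k + 1, j) + p[i - 1] * p[k] * p[j]
                   for k in range(i, j))

    return m(1, n)

print(mcm_memo([10, 30, 5, 60]))  # 4500
```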
7. Bottom-Up (Tabulation)
We fill the table by increasing chain length L, from length 2 up to length n.
```
function mcmDP(p, n):
    m = array[n+1][n+1], initialized to 0
    s = array[n+1][n+1], initialized to 0
    // L is the chain length
    for L = 2 to n:
        for i = 1 to n - L + 1:
            j = i + L - 1
            m[i][j] = infinity
            for k = i to j - 1:
                cost = m[i][k] + m[k+1][j] + p[i-1] * p[k] * p[j]
                if cost < m[i][j]:
                    m[i][j] = cost
                    s[i][j] = k
    return m[1][n], s
```

8. Worked Example
Dimensions: p = [30, 35, 15, 5, 10, 20, 25], representing 6 matrices:
- A1: 30x35, A2: 35x15, A3: 15x5, A4: 5x10, A5: 10x20, A6: 20x25
Chain Length L = 2
| Subproblem | Split k | Cost = p[i-1]*p[k]*p[j] | m[i][j] |
|---|---|---|---|
| m[1][2] | 1 | 30*35*15 = 15,750 | 15,750 |
| m[2][3] | 2 | 35*15*5 = 2,625 | 2,625 |
| m[3][4] | 3 | 15*5*10 = 750 | 750 |
| m[4][5] | 4 | 5*10*20 = 1,000 | 1,000 |
| m[5][6] | 5 | 10*20*25 = 5,000 | 5,000 |
Chain Length L = 3 (selected entries)
| Subproblem | k=i | k=i+1 | m[i][j] | Best k |
|---|---|---|---|---|
| m[1][3] | m[1][1]+m[2][3]+30*35*5 = 0+2625+5250 = 7,875 | m[1][2]+m[3][3]+30*15*5 = 15750+0+2250 = 18,000 | 7,875 | 1 |
| m[2][4] | m[2][2]+m[3][4]+35*15*10 = 0+750+5250 = 6,000 | m[2][3]+m[4][4]+35*5*10 = 2625+0+1750 = 4,375 | 4,375 | 3 |
Final Cost Table m[i][j]
| | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| 1 | 0 | 15,750 | 7,875 | 9,375 | 11,875 | 15,125 |
| 2 | | 0 | 2,625 | 4,375 | 7,125 | 10,500 |
| 3 | | | 0 | 750 | 2,500 | 5,375 |
| 4 | | | | 0 | 1,000 | 3,500 |
| 5 | | | | | 0 | 5,000 |
| 6 | | | | | | 0 |
The minimum cost is m[1][6] = 15,125 scalar multiplications.
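The table above can be reproduced with a direct Python translation of the `mcmDP` pseudocode (our own function name; lists are padded so indices match the 1-based pseudocode):

```python
import math

def mcm_dp(p):
    """Bottom-up MCM. Returns (m, s) tables with 1-based indices i, j."""
    n = len(p) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for L in range(2, n + 1):             # chain length
        for i in range(1, n - L + 2):     # chain start
            j = i + L - 1                 # chain end
            m[i][j] = math.inf
            for k in range(i, j):         # split point
                cost = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if cost < m[i][j]:
                    m[i][j] = cost
                    s[i][j] = k
    return m, s

m, s = mcm_dp([30, 35, 15, 5, 10, 20, 25])
print(m[1][6])  # 15125
```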
9. Reconstructing the Parenthesization
The split table s[i][j] records the optimal split point for each subproblem.
```
function printParens(s, i, j):
    if i == j:
        print("A" + i)
    else:
        print("(")
        printParens(s, i, s[i][j])
        printParens(s, s[i][j] + 1, j)
        print(")")
```

Split Table s[i][j]
| | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|
| 1 | 1 | 1 | 3 | 3 | 3 |
| 2 | | 2 | 3 | 3 | 3 |
| 3 | | | 3 | 3 | 3 |
| 4 | | | | 4 | 5 |
| 5 | | | | | 5 |
Following the split table from s[1][6] = 3: the optimal parenthesization is ((A1(A2 A3))((A4 A5)A6)), which achieves the minimum cost of 15,125 scalar multiplications.
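The same reconstruction can be done in Python by returning a string instead of printing; this self-contained sketch fills the tables and then walks the split table (all function names are ours):

```python
import math

def mcm_tables(p):
    """Fill the cost table m and split table s bottom-up (1-based i, j)."""
    n = len(p) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for L in range(2, n + 1):
        for i in range(1, n - L + 2):
            j = i + L - 1
            m[i][j] = math.inf
            for k in range(i, j):
                cost = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if cost < m[i][j]:
                    m[i][j], s[i][j] = cost, k
    return m, s

def parens(s, i, j):
    """Build the optimal parenthesization string from the split table."""
    if i == j:
        return f"A{i}"
    k = s[i][j]
    return "(" + parens(s, i, k) + parens(s, k + 1, j) + ")"

m, s = mcm_tables([30, 35, 15, 5, 10, 20, 25])
print(parens(s, 1, 6))  # ((A1(A2A3))((A4A5)A6))
```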
10. Complexity Analysis
| Approach | Time Complexity | Space Complexity |
|---|---|---|
| Brute Force (Catalan) | O(4^n / n^(3/2)) | O(n) |
| Top-Down (Memoization) | O(n^3) | O(n^2) |
| Bottom-Up (Tabulation) | O(n^3) | O(n^2) |
The DP solution has three nested loops: the chain length L (O(n)), the starting position i (O(n)), and the split point k (O(n)), giving O(n^3) time. The two tables m and s each require O(n^2) space. Hu and Shing's algorithm can solve this in O(n log n), but the cubic DP solution is standard.
11. Variations
Optimal Binary Search Tree
The optimal BST problem has a very similar structure: given keys with known access frequencies, find the tree structure that minimizes total search cost. The recurrence splits at a root rather than a multiplication boundary.
Polygon Triangulation
Dividing a convex polygon into triangles to minimize total cost is structurally identical to MCM. Each triangulation corresponds to a parenthesization of the "chain" of polygon edges.
Parallel Matrix Multiplication
When matrices can be multiplied in parallel, the objective changes from minimizing total operations to minimizing the critical path length (parallel time).
12. Applications
- Scientific computing: Large-scale linear algebra computations in physics simulations and machine learning benefit from optimal matrix multiplication ordering.
- Database query optimization: Joining multiple tables is analogous to matrix chain multiplication, where the "dimensions" correspond to table sizes and the "cost" is the number of row comparisons.
- Computer graphics: Transformation matrix chains in 3D rendering pipelines can be optimized using MCM principles.
- Compiler optimization: Expression evaluation order optimization for matrix expressions in numerical computing libraries.
- Parsing: The CYK algorithm for context-free grammar parsing has a structure similar to MCM.
13. References
- Cormen, T. H., Leiserson, C. E., Rivest, R. L., and Stein, C. Introduction to Algorithms (4th ed.). MIT Press, 2022. Chapter 14: Dynamic Programming.
- Hu, T. C. and Shing, M. T. "Computation of matrix chain products. Part I." SIAM Journal on Computing, 11(2):362-373, 1982.
- Godbole, S. S. "On efficient computation of matrix chain products." IEEE Transactions on Computers, C-22(9):864-866, 1973.
- Kleinberg, J. and Tardos, E. Algorithm Design. Pearson, 2005.