1. Introduction

When multiplying a chain of matrices, the order in which we perform the multiplications can dramatically affect the total number of scalar operations. Matrix multiplication is associative -- the result is the same regardless of how we parenthesize the product -- but the computational cost can vary by orders of magnitude depending on the grouping.

"The matrix-chain multiplication problem is perhaps the most well-known example of a dynamic programming problem that is not immediately obvious how to decompose into subproblems." -- Thomas H. Cormen, Introduction to Algorithms

The Matrix Chain Multiplication (MCM) problem asks: given a sequence of matrices, find the parenthesization that minimizes the total number of scalar multiplications.

2. Problem Definition

Input: A sequence of n matrices A1, A2, ..., An where matrix Ai has dimensions p[i-1] x p[i]. The dimensions are given as an array p[0..n].

Output: The minimum number of scalar multiplications needed to compute the product A1 * A2 * ... * An, along with the optimal parenthesization.

Matrix multiplication cost: Multiplying a p x q matrix by a q x r matrix requires p * q * r scalar multiplications and produces a p x r matrix.
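As a sanity check on this cost model, a naive triple-loop multiplication performs exactly p * q * r scalar multiplications. A small Python sketch (the helper `multiply_count` is purely illustrative):

```python
def multiply_count(A, B):
    """Naive matrix multiply; returns (product, number of scalar multiplications)."""
    p, q, r = len(A), len(B), len(B[0])
    assert len(A[0]) == q, "inner dimensions must match"
    C = [[0] * r for _ in range(p)]
    mults = 0
    for i in range(p):
        for j in range(r):
            for k in range(q):
                C[i][j] += A[i][k] * B[k][j]
                mults += 1  # one scalar multiplication per inner step
    return C, mults

# A 2x3 matrix times a 3x4 matrix: 2 * 3 * 4 = 24 scalar multiplications.
A = [[1, 2, 3], [4, 5, 6]]
B = [[1, 0, 0, 1], [0, 1, 0, 1], [0, 0, 1, 1]]
C, mults = multiply_count(A, B)
print(mults)  # 24
```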

Matrix    Dimensions
A1        10 x 30
A2        30 x 5
A3        5 x 60

Dimensions array: p = [10, 30, 5, 60]

3. Why Parenthesization Matters

Consider the three matrices above. There are two possible parenthesizations:

Parenthesization    Computation                                        Scalar Multiplications
(A1 * A2) * A3      First: 10*30*5 = 1,500; Then: 10*5*60 = 3,000      4,500
A1 * (A2 * A3)      First: 30*5*60 = 9,000; Then: 10*30*60 = 18,000    27,000

A 6x difference! The first parenthesization requires only 4,500 multiplications while the second requires 27,000. For longer chains, the gap between the best and worst orderings can be arbitrarily large.
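These two totals can be recomputed directly from the dimensions array (a small Python check; the variable names are ours):

```python
p = [10, 30, 5, 60]  # A1: 10x30, A2: 30x5, A3: 5x60

# (A1 * A2) * A3: first a 10x30 times 30x5, then a 10x5 times 5x60
left_first = p[0] * p[1] * p[2] + p[0] * p[2] * p[3]   # 1,500 + 3,000

# A1 * (A2 * A3): first a 30x5 times 5x60, then a 10x30 times 30x60
right_first = p[1] * p[2] * p[3] + p[0] * p[1] * p[3]  # 9,000 + 18,000

print(left_first, right_first)  # 4500 27000
```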

4. Brute Force Approach

The brute force approach tries all possible parenthesizations. The number of ways to parenthesize a product of n matrices is the (n-1)th Catalan number, which grows as O(4^n / n^(3/2)).

function mcmBrute(p, i, j):
    if i == j:
        return 0                      // Single matrix, no multiplication needed
    minCost = infinity
    for k = i to j-1:
        cost = mcmBrute(p, i, k)      // Cost of left subchain
             + mcmBrute(p, k+1, j)    // Cost of right subchain
             + p[i-1] * p[k] * p[j]   // Cost of final multiplication
        minCost = min(minCost, cost)
    return minCost

For n matrices, we split at position k (where i <= k < j), recursively solve the left and right subchains, and add the cost of multiplying the two resulting matrices.
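The recursion above translates directly into runnable Python (a sketch keeping the same 1-indexed matrix convention; `mcm_brute` is our naming):

```python
def mcm_brute(p, i, j):
    """Minimum scalar multiplications for A_i .. A_j, with dims array p[0..n]."""
    if i == j:
        return 0  # single matrix: nothing to multiply
    best = float("inf")
    for k in range(i, j):
        cost = (mcm_brute(p, i, k)         # left subchain
                + mcm_brute(p, k + 1, j)   # right subchain
                + p[i - 1] * p[k] * p[j])  # final multiplication
        best = min(best, cost)
    return best

print(mcm_brute([10, 30, 5, 60], 1, 3))  # 4500
```

This is exponential time: every split point spawns two fresh recursive calls with no sharing of results.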

5. Optimal Substructure

If the optimal solution splits the chain Ai..Aj at position k, then:

  • The left subchain Ai..Ak must be parenthesized optimally.
  • The right subchain Ak+1..Aj must be parenthesized optimally.

Any suboptimal parenthesization of either subchain would increase the total cost, contradicting the assumption of overall optimality. This is the optimal substructure property.

Overlapping subproblems arise because different splits at higher levels lead to the same subchains. For example, when computing the optimal cost for A1..A5, the subchain A2..A4 is needed both inside A2..A5 (after the split (A1)(A2..A5)) and inside A1..A4 (after the split (A1..A4)(A5)).
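The overlap can be measured by instrumenting the plain recursion (a Python sketch; `count_calls` and its bookkeeping are illustrative only). For a chain of six matrices the recursion makes 243 calls, yet there are only 21 distinct (i, j) subproblems:

```python
def count_calls(p, n):
    """Run the plain recursion, counting calls and distinct (i, j) subproblems."""
    stats = {"calls": 0}
    seen = set()

    def go(i, j):
        stats["calls"] += 1
        seen.add((i, j))
        if i == j:
            return 0
        return min(go(i, k) + go(k + 1, j) + p[i - 1] * p[k] * p[j]
                   for k in range(i, j))

    best = go(1, n)
    return best, stats["calls"], len(seen)

best, calls, distinct = count_calls([30, 35, 15, 5, 10, 20, 25], 6)
print(best, calls, distinct)  # 15125 243 21
```

Caching each of the O(n^2) distinct subproblems is exactly what the dynamic-programming solutions below do.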

6. Recurrence Relation

Let m[i][j] represent the minimum number of scalar multiplications needed to compute Ai * Ai+1 * ... * Aj.

Recurrence:

m[i][j] = min over i <= k < j of { m[i][k] + m[k+1][j] + p[i-1] * p[k] * p[j] }

Base case: m[i][i] = 0 (a single matrix requires no multiplication)

We also maintain a table s[i][j] that records the split point k that achieved the minimum, which we use later to reconstruct the optimal parenthesization.
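Read literally, the recurrence gives a top-down memoized implementation, the O(n^3) approach listed later in the complexity table (a Python sketch; `mcm_memo` is our name):

```python
from functools import lru_cache

def mcm_memo(p):
    """Top-down evaluation of the recurrence; p[0..n] is the dimensions array."""
    n = len(p) - 1  # number of matrices

    @lru_cache(maxsize=None)  # cache each distinct (i, j) subproblem once
    def m(i, j):
        if i == j:
            return 0  # base case: a single matrix
        return min(m(i, k) + m(k + 1, j) + p[i - 1] * p[k] * p[j]
                   for k in range(i, j))

    return m(1, n)

print(mcm_memo([30, 35, 15, 5, 10, 20, 25]))  # 15125
```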

7. Bottom-Up (Tabulation)

We fill the table by increasing chain length L, from length 2 up to length n.

function mcmDP(p, n):
    m = array[n+1][n+1], initialized to 0
    s = array[n+1][n+1], initialized to 0
    // L is the chain length
    for L = 2 to n:
        for i = 1 to n - L + 1:
            j = i + L - 1
            m[i][j] = infinity
            for k = i to j - 1:
                cost = m[i][k] + m[k+1][j] + p[i-1] * p[k] * p[j]
                if cost < m[i][j]:
                    m[i][j] = cost
                    s[i][j] = k
    return m[1][n], s
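A direct Python translation of this pseudocode (a sketch; `mcm_dp` is our name for it):

```python
def mcm_dp(p):
    """Bottom-up matrix-chain DP. p[0..n] are the dimensions; returns (m, s)."""
    n = len(p) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for L in range(2, n + 1):            # chain length
        for i in range(1, n - L + 2):    # start of the chain
            j = i + L - 1
            m[i][j] = float("inf")
            for k in range(i, j):        # split point
                cost = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if cost < m[i][j]:
                    m[i][j] = cost
                    s[i][j] = k          # remember the best split
    return m, s

m, s = mcm_dp([30, 35, 15, 5, 10, 20, 25])
print(m[1][6])  # 15125
```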

8. Worked Example

Dimensions: p = [30, 35, 15, 5, 10, 20, 25], representing 6 matrices:

  • A1: 30x35, A2: 35x15, A3: 15x5, A4: 5x10, A5: 10x20, A6: 20x25

Chain Length L = 2

Subproblem   Split k   Cost = p[i-1]*p[k]*p[j]   m[i][j]
m[1][2]      1         30*35*15 = 15,750         15,750
m[2][3]      2         35*15*5 = 2,625           2,625
m[3][4]      3         15*5*10 = 750             750
m[4][5]      4         5*10*20 = 1,000           1,000
m[5][6]      5         10*20*25 = 5,000          5,000

Chain Length L = 3 (selected entries)

m[1][3]:
    k=1: m[1][1] + m[2][3] + 30*35*5 = 0 + 2,625 + 5,250 = 7,875
    k=2: m[1][2] + m[3][3] + 30*15*5 = 15,750 + 0 + 2,250 = 18,000
    Best: k=1, so m[1][3] = 7,875
m[2][4]:
    k=2: m[2][2] + m[3][4] + 35*15*10 = 0 + 750 + 5,250 = 6,000
    k=3: m[2][3] + m[4][4] + 35*5*10 = 2,625 + 0 + 1,750 = 4,375
    Best: k=3, so m[2][4] = 4,375

Final Cost Table m[i][j]

i\j      1        2        3        4        5        6
 1       0   15,750    7,875    9,375   11,875   15,125
 2                0    2,625    4,375    7,125   10,500
 3                         0      750    2,500    5,375
 4                                  0    1,000    3,500
 5                                           0    5,000
 6                                                    0

The minimum cost is m[1][6] = 15,125 scalar multiplications.

9. Reconstructing the Parenthesization

The split table s[i][j] records the optimal split point for each subproblem.

function printParens(s, i, j):
    if i == j:
        print("A" + i)
    else:
        print("(")
        printParens(s, i, s[i][j])
        printParens(s, s[i][j] + 1, j)
        print(")")

Split Table s[i][j]

i\j     2     3     4     5     6
 1      1     1     3     3     3
 2            2     3     3     3
 3                  3     3     3
 4                        4     5
 5                              5

Following the split table from s[1][6] = 3, the top-level split is (A1..A3)(A4..A6); recursing with s[1][3] = 1 and s[4][6] = 5 yields the optimal parenthesization ((A1(A2 A3))((A4 A5)A6)), which achieves the minimum cost of 15,125 scalar multiplications.
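The pseudocode above prints tokens as a side effect; a variant that returns the string is often easier to test. A Python sketch (the names `mcm_dp` and `parens` are ours), with the DP from Section 7 compacted to produce the split table:

```python
def mcm_dp(p):
    # Bottom-up DP from Section 7: returns cost table m and split table s.
    n = len(p) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for L in range(2, n + 1):
        for i in range(1, n - L + 2):
            j = i + L - 1
            # min over split points, keeping the k that achieves it
            m[i][j], s[i][j] = min(
                (m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j], k)
                for k in range(i, j))
    return m, s

def parens(s, i, j):
    # Recurse on the recorded split points to build the parenthesization.
    if i == j:
        return f"A{i}"
    k = s[i][j]
    return "(" + parens(s, i, k) + parens(s, k + 1, j) + ")"

m, s = mcm_dp([30, 35, 15, 5, 10, 20, 25])
print(parens(s, 1, 6))  # ((A1(A2A3))((A4A5)A6))
```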

10. Complexity Analysis

Approach                  Time Complexity      Space Complexity
Brute Force (Catalan)     O(4^n / n^(3/2))     O(n)
Top-Down (Memoization)    O(n^3)               O(n^2)
Bottom-Up (Tabulation)    O(n^3)               O(n^2)

The DP solution has three nested loops: the chain length L (O(n)), the starting position i (O(n)), and the split point k (O(n)), giving O(n^3) time. The two tables m and s each require O(n^2) space. Hu and Shing's algorithm can solve this in O(n log n), but the cubic DP solution is standard.

11. Variations

Optimal Binary Search Tree

The optimal BST problem has a very similar structure: given keys with known access frequencies, find the tree structure that minimizes total search cost. The recurrence splits at a root rather than a multiplication boundary.

Polygon Triangulation

Dividing a convex polygon into triangles to minimize total cost is structurally identical to MCM. Each triangulation corresponds to a parenthesization of the "chain" of polygon edges.

Parallel Matrix Multiplication

When matrices can be multiplied in parallel, the objective changes from minimizing total operations to minimizing the critical path length (parallel time).

12. Applications

  • Scientific computing: Large-scale linear algebra computations in physics simulations and machine learning benefit from optimal matrix multiplication ordering.
  • Database query optimization: Joining multiple tables is analogous to matrix chain multiplication, where the "dimensions" correspond to table sizes and the "cost" is the number of row comparisons.
  • Computer graphics: Transformation matrix chains in 3D rendering pipelines can be optimized using MCM principles.
  • Compiler optimization: Expression evaluation order optimization for matrix expressions in numerical computing libraries.
  • Parsing: The CYK algorithm for context-free grammar parsing has a structure similar to MCM.

13. References

  1. Cormen, T. H., Leiserson, C. E., Rivest, R. L., and Stein, C. Introduction to Algorithms (4th ed.). MIT Press, 2022. Chapter 14: Dynamic Programming.
  2. Hu, T. C. and Shing, M. T. "Computation of matrix chain products. Part I." SIAM Journal on Computing, 11(2):362-373, 1982.
  3. Godbole, S. S. "On efficient computation of matrix chain products." IEEE Transactions on Computers, C-22(9):864-866, 1973.
  4. Kleinberg, J. and Tardos, E. Algorithm Design. Pearson, 2005.