Introduction

Sudoku is a logic-based number-placement puzzle played on a 9x9 grid divided into nine 3x3 sub-grids (called boxes or blocks). Some cells contain pre-filled digits (clues), and the goal is to fill the remaining cells so that each row, each column, and each 3x3 box contains all digits from 1 to 9 exactly once.

From a computational perspective, solving Sudoku is a constraint satisfaction problem (CSP). The puzzle defines a set of variables (empty cells), domains (digits 1-9), and constraints (no repeated digits in any row, column, or box). The challenge is to find an assignment of values to all variables that satisfies every constraint simultaneously.

Solving general Sudoku (on an n^2 x n^2 grid) is NP-complete, as proved by Yato and Seta in 2003. However, the standard 9x9 puzzle has a fixed, small size, and modern solvers can handle even the hardest instances in milliseconds by combining constraint propagation with backtracking search.

Problem Formulation

As a Constraint Satisfaction Problem

The Sudoku puzzle can be formalized as follows:

  • Variables: 81 cells, each denoted by its position (r, c) where r, c are in {0, 1, ..., 8}
  • Domains: Each variable has domain {1, 2, ..., 9}, reduced by the initial clues
  • Constraints: AllDifferent constraint on each row (9 constraints), each column (9 constraints), and each 3x3 box (9 constraints), for 27 constraints total

Peers and Units

Each cell has exactly 20 peers: 8 in its row, 8 in its column, and 4 additional cells in its 3x3 box that are not already counted. A cell's value must differ from all 20 of its peers.

Constraint TypeCountCells per Unit
Row constraints99
Column constraints99
Box constraints99
Total peers per cell--20

Backtracking Solver

The basic backtracking solver finds an empty cell, tries each digit from 1 to 9, checks validity, and recursively attempts to fill the remaining cells. If no digit works, it backtracks.

Pseudocode

procedure SolveSudoku(grid): cell = findEmptyCell(grid) if cell is null: return true // all cells filled row, col = cell for digit from 1 to 9: if isValid(grid, row, col, digit): grid[row][col] = digit if SolveSudoku(grid): return true grid[row][col] = 0 // backtrack return falsefunction isValid(grid, row, col, digit): // Check row for c from 0 to 8: if grid[row][c] == digit: return false // Check column for r from 0 to 8: if grid[r][col] == digit: return false // Check 3x3 box boxRow = (row / 3) * 3 boxCol = (col / 3) * 3 for r from boxRow to boxRow + 2: for c from boxCol to boxCol + 2: if grid[r][c] == digit: return false return true

Constraint Propagation

Constraint propagation reduces the search space before and during backtracking by deducing forced values. Two key strategies are:

Naked Singles

If a cell has only one candidate value remaining (its domain has been reduced to a single element by eliminating the values of all its peers), that value must be assigned to the cell. This assignment may in turn reduce the domains of the cell's peers, triggering a cascade of further deductions.

Hidden Singles

If a digit can appear in only one cell within a row, column, or box, it must be placed there, even if that cell has other candidates. For example, if digit 7 is a candidate in only one cell of row 3, then that cell must contain 7.

procedure Propagate(grid, candidates): changed = true while changed: changed = false // Naked singles for each cell (r, c) with |candidates[r][c]| == 1: digit = the single candidate assign digit to grid[r][c] remove digit from all peers' candidate lists changed = true // Hidden singles for each unit (row, column, or box): for each digit d from 1 to 9: cells = cells in unit where d is a candidate if |cells| == 1: assign d to that cell remove d from all peers' candidate lists changed = true if |cells| == 0: return CONTRADICTION

Peter Norvig demonstrated that combining constraint propagation with backtracking and the MRV (Minimum Remaining Values) heuristic solves every known 9x9 Sudoku puzzle in under a second, typically exploring fewer than 100 nodes in the search tree.

Worked Example

Consider a puzzle with the following partial grid (0 = empty):

5 3 0 | 0 7 0 | 0 0 06 0 0 | 1 9 5 | 0 0 00 9 8 | 0 0 0 | 0 6 0------+-------+------8 0 0 | 0 6 0 | 0 0 34 0 0 | 8 0 3 | 0 0 17 0 0 | 0 2 0 | 0 0 6------+-------+------0 6 0 | 0 0 0 | 2 8 00 0 0 | 4 1 9 | 0 0 50 0 0 | 0 8 0 | 0 7 9

Cell (0,2): Row 0 has {5,3,7}. Column 2 has {8}. Box has {5,3,6,9,8}. Candidates: {1,2,4}. After hidden single analysis in row 0, this cell must be 4.

Cell (0,3): After placing 4, candidates become {2,6}. Constraint propagation through the column and box narrows this to {6}.

Continuing propagation resolves most cells without backtracking. The few ambiguous cells require the solver to guess and backtrack if a contradiction arises.

Complexity Analysis

MethodWorst CaseTypical 9x9
Brute-force backtrackingO(9^(n^2))Millions of nodes
Backtracking + propagationO(9^(n^2))Under 100 nodes
Dancing Links (DLX)O(9^(n^2))Very fast in practice

The theoretical worst case remains exponential for all known methods, since Sudoku solving is NP-complete in general. However, for the fixed 9x9 size, the problem size is constant, and practical performance depends on the effectiveness of propagation on the specific puzzle instance.

Advanced Techniques

Minimum Remaining Values (MRV) Heuristic

When choosing which empty cell to fill next, select the cell with the fewest remaining candidates. This "fail-first" strategy detects dead ends early and reduces the branching factor of the search tree.

Dancing Links (Algorithm X)

Sudoku can be reformulated as an exact cover problem and solved using Knuth's Algorithm X with Dancing Links (DLX). The exact cover formulation creates a sparse binary matrix where each row represents a possible digit placement and each column represents a constraint. DLX uses doubly-linked circular lists to efficiently cover and uncover columns during backtracking.

Naked Pairs and Triples

If two cells in a unit share the same two candidates and no others, those two values can be eliminated from all other cells in the unit. Similarly for triples and quads. These techniques extend constraint propagation beyond simple singles.

X-Wing and Swordfish

These are pattern-based elimination techniques that identify configurations spanning multiple rows and columns. An X-Wing occurs when a candidate appears in exactly two cells in each of two rows, and those cells align in the same columns, allowing eliminations in those columns.

Implementation

def solve_sudoku(board): def find_empty(): min_count = 10 best = None for r in range(9): for c in range(9): if board[r][c] == 0: count = len(get_candidates(r, c)) if count < min_count: min_count = count best = (r, c) return best def get_candidates(r, c): used = set() for i in range(9): used.add(board[r][i]) used.add(board[i][c]) br, bc = 3 * (r // 3), 3 * (c // 3) for i in range(br, br + 3): for j in range(bc, bc + 3): used.add(board[i][j]) return [d for d in range(1, 10) if d not in used] cell = find_empty() if cell is None: return True r, c = cell for digit in get_candidates(r, c): board[r][c] = digit if solve_sudoku(board): return True board[r][c] = 0 return False

Puzzle Generation and Difficulty

Generating a valid Sudoku puzzle involves two steps: creating a fully solved grid, then removing clues while ensuring the puzzle has a unique solution.

Minimum Clues

McGuire, Tugemann, and Civario (2014) proved that no valid 9x9 Sudoku puzzle exists with fewer than 17 clues. This result required an exhaustive computational search and is one of the landmark results in recreational mathematics.

Difficulty Rating

Puzzle difficulty correlates not with the number of clues but with the complexity of the deduction techniques required. Easy puzzles can be solved with naked and hidden singles alone. Medium puzzles may require pairs and triples. Hard puzzles demand techniques like X-Wings, coloring, or trial-and-error (backtracking).

The number of essentially different completed 9x9 Sudoku grids (up to relabeling, rotation, reflection, and band permutations) is 5,472,730,538, as computed by Jarvis and Russell (2006).

References

  • Norvig, P. "Solving Every Sudoku Puzzle." 2006. Available at norvig.com/sudoku.html
  • Knuth, D. E. "Dancing Links." Millennial Perspectives in Computer Science, 2000, pp. 187-214.
  • Yato, T. and Seta, T. "Complexity and Completeness of Finding Another Solution and Its Application to Puzzles." IEICE Transactions, vol. 86-A, 2003, pp. 1052-1060.
  • McGuire, G., Tugemann, B., and Civario, G. "There Is No 16-Clue Sudoku: Solving the Sudoku Minimum Number of Clues Problem via Hitting Set Enumeration." Experimental Mathematics, vol. 23, 2014, pp. 190-217.
  • Jarvis, F. and Russell, E. "There Are 5472730538 Essentially Different Sudoku Grids." 2006.
  • Simonis, H. "Sudoku as a Constraint Problem." Cork Constraint Computation Centre, 2005.