Introduction
Informed search: algorithms using domain knowledge (heuristics) to guide search toward goal. Contrast uninformed: breadth-first, depth-first search blindly explore. Heuristics estimate cost from current node to goal. Prioritize promising nodes, ignore unpromising. Enables efficient search in huge spaces.
Core idea: use problem-specific knowledge to prune search. Instead exploring all paths, focus on likely paths to goal. Dramatic speedup possible: exponential spaces become tractable with good heuristics.
Key algorithms: Greedy best-first (fast but not optimal), A* (optimal with admissible heuristic), IDA* (memory-bounded). Heuristic quality crucial: good heuristic = efficient search, bad = poor performance.
"Informed search harnesses domain knowledge to transform intractable search spaces into solvable problems. Well-designed heuristics enable optimal solutions to problems exponential in uninformed search." -- Heuristic search methods
Informed vs. Uninformed Search
Uninformed Search
Uninformed (blind): no problem-specific knowledge. Only graph structure guides search (neighbor relationships). BFS: explores level by level. DFS: follows one branch to full depth before backtracking. Guarantees: BFS optimal for unit step costs, both complete on finite graphs, but exponential time/space.
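A minimal sketch of the uninformed baseline: BFS returning a shortest path (fewest edges) in an unweighted graph. The graph and function names are illustrative, not from the source.

```python
from collections import deque

def bfs_shortest_path(graph, start, goal):
    """Breadth-first search: explores level by level; returns a shortest
    path (fewest edges) in an unweighted graph, or None if unreachable."""
    frontier = deque([[start]])
    visited = {start}
    while frontier:
        path = frontier.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nbr in graph.get(node, []):
            if nbr not in visited:
                visited.add(nbr)
                frontier.append(path + [nbr])
    return None

# Small example graph (adjacency lists; hypothetical).
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"]}
path = bfs_shortest_path(graph, "A", "E")
```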
Informed Search
Informed (heuristic): uses heuristic evaluating promise of nodes. Heuristic estimates cost to goal. Prioritize high-promise nodes. Empirically much faster. But needs good heuristic, may not find optimal solution if heuristic misleading.
Efficiency Gains
Uninformed: search 1 million nodes. Informed with good heuristic: search 1,000 nodes. 1000x speedup possible. Difference between solving instantly vs. computationally infeasible.
Knowledge Requirements
Informed search requires understanding the problem domain: what makes a solution good? How to estimate distance to goal? Uninformed search is generic: works on any problem, but slowly.
Trade-Offs
Uninformed: complete, and optimal under the right conditions (e.g., BFS with unit costs). Slow. Informed: greedy variants may be suboptimal or incomplete, but much faster; A* recovers optimality given an admissible heuristic. Choose based on requirements.
Heuristic Functions
Heuristic Definition
Heuristic function h(n): estimates cost from node n to goal. h(goal) = 0. h(n) >= 0 for all n. Domain-specific: depends on problem. Not exact, approximation.
Example Heuristics
Traveling Salesman: h(n) = straight-line distance to nearest unvisited city. Route planning: h(n) = straight-line distance to destination. Sliding-tile puzzles: h(n) = number of misplaced tiles, or Manhattan distance (sum of tile distances to goal positions). Water jug: h(n) = difference between current and goal amounts.
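The two sliding-tile heuristics above can be sketched as follows; states are flat tuples with 0 as the blank, a representation assumed for illustration.

```python
def misplaced_tiles(state, goal):
    """Count tiles not in their goal position (blank, 0, excluded)."""
    return sum(1 for s, g in zip(state, goal) if s != 0 and s != g)

def manhattan_distance(state, goal, width=3):
    """Sum over tiles of |row diff| + |col diff| to the goal position."""
    total = 0
    for idx, tile in enumerate(state):
        if tile == 0:
            continue
        gidx = goal.index(tile)
        total += abs(idx // width - gidx // width) + abs(idx % width - gidx % width)
    return total

goal  = (1, 2, 3, 4, 5, 6, 7, 8, 0)
state = (1, 2, 3, 4, 5, 6, 0, 7, 8)   # tiles 7 and 8 shifted one column
h1 = misplaced_tiles(state, goal)      # 2 misplaced tiles
h2 = manhattan_distance(state, goal)   # each is one column off: 2
```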
Design Principles
Admissible: heuristic should be a lower bound on (underestimate of) the true cost. Relevant: heuristic captures some aspect of problem difficulty. Efficient: computable quickly (else no net speedup). Selective: distinguishes promising from unpromising nodes.
Relaxed Problems
Derive heuristics from relaxed versions. Remove constraints, problem becomes easier, optimal solution is heuristic. Example: 8-puzzle with relaxed movement (slide any tile anywhere) gives heuristic.
Pattern Databases
Pre-compute optimal solutions for subproblems. Use as heuristic for full problem. Example: optimal solution for subset of puzzle tiles cached, queried as heuristic.
Learning Heuristics
Machine learning: train neural networks to predict cost to goal. Example: learn to estimate remaining distance in game tree. Learns heuristic from experience.
Admissible Heuristics
Admissibility Definition
Heuristic h admissible if never overestimates true cost. h(n) <= actual cost from n to goal. Formally: h(n) <= h*(n) where h* true cost.
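One way to check admissibility on a small graph: compute h*(n) exactly with Dijkstra from the goal, then verify h(n) <= h*(n) everywhere. The graph, heuristic values, and helper names below are illustrative assumptions.

```python
import heapq

def true_costs_to_goal(graph, goal):
    """Dijkstra from the goal gives h*(n), the true cost from each node
    to the goal (assumes an undirected graph, edges listed both ways)."""
    dist = {goal: 0}
    pq = [(0, goal)]
    while pq:
        d, node = heapq.heappop(pq)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for nbr, w in graph.get(node, []):
            nd = d + w
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(pq, (nd, nbr))
    return dist

def is_admissible(h, graph, goal):
    """h is admissible iff h(n) <= h*(n) for every reachable node n."""
    h_star = true_costs_to_goal(graph, goal)
    return all(h(n) <= c for n, c in h_star.items())

# Undirected example graph: node -> [(neighbor, edge cost), ...]
graph = {
    "A": [("B", 1), ("C", 4)],
    "B": [("A", 1), ("C", 2), ("G", 5)],
    "C": [("A", 4), ("B", 2), ("G", 1)],
    "G": [("B", 5), ("C", 1)],
}
# h* here: A=4, B=3, C=1, G=0
ok  = is_admissible(lambda n: {"A": 3, "B": 2, "C": 1, "G": 0}[n], graph, "G")
bad = is_admissible(lambda n: {"A": 9, "B": 2, "C": 1, "G": 0}[n], graph, "G")
```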
Importance
Admissible heuristics guarantee optimal solutions (with A*). Non-admissible: may find suboptimal solutions. Trade-off: admissible often weaker (less informative), non-admissible stronger but risky.
Examples
Straight-line distance: never overestimates actual road distance (the straight line is the shortest possible). Misplaced tiles: never overestimates moves needed (each misplaced tile requires at least one move). Both admissible.
If a heuristic overestimates, it is non-admissible and A* may return suboptimal paths. Example: a route-planning heuristic that pads estimates with worst-case traffic delays can exceed the actual travel time, causing the truly optimal route to be passed over.
Consistency (Monotonicity)
Consistent: h(n) <= cost(n,n') + h(n') for all neighbors n'. Stronger property: consistent implies admissible. Consistency prevents revising node costs (ensures optimal path properties).
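The consistency condition is easy to check edge by edge. In the hypothetical graph below, h_bad equals h* at A (so it is admissible) yet violates the triangle condition on edge A->B, illustrating that admissible does not imply consistent.

```python
def is_consistent(h, graph):
    """h is consistent iff h(n) <= cost(n, n') + h(n') for every edge."""
    return all(h(n) <= w + h(nbr)
               for n, edges in graph.items()
               for nbr, w in edges)

# Directed example graph: node -> [(neighbor, edge cost), ...]
graph = {
    "A": [("B", 1), ("C", 4)],
    "B": [("C", 2), ("G", 5)],
    "C": [("G", 1)],
    "G": [],
}
# True costs to G: A=4, B=3, C=1, G=0
h_good = {"A": 3, "B": 2, "C": 1, "G": 0}   # admissible and consistent
h_bad  = {"A": 4, "B": 2, "C": 1, "G": 0}   # admissible, but A->B: 4 > 1+2
good_ok = is_consistent(lambda n: h_good[n], graph)
bad_ok  = is_consistent(lambda n: h_bad[n], graph)
```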
Relationship to Dominance
h1 dominates h2 if h1(n) >= h2(n) for all n (h1 is the tighter lower bound). With both heuristics admissible, A* with the dominant h1 expands no more nodes than with h2 (up to tie-breaking). Search with a dominant heuristic is at least as efficient.
Greedy Best-First Search
Algorithm
import heapq

def greedy_best_first_search(start, goal, neighbors, h):
    # Frontier ordered by h(n); each entry carries its path from start.
    open_list = [(h(start), start, [start])]
    closed_list = set()
    while open_list:
        _, node, path = heapq.heappop(open_list)  # node with minimum h(n)
        if node == goal:
            return path
        if node in closed_list:
            continue
        closed_list.add(node)
        for neighbor in neighbors(node):
            if neighbor not in closed_list:
                heapq.heappush(open_list, (h(neighbor), neighbor, path + [neighbor]))
    return None  # failure
How It Works
Maintain open list (frontier) ordered by h(n). Always expand node with lowest h(n) (closest to goal by estimate). Greedy: commit to locally best choice.
Efficiency
Very efficient: low space/time if heuristic good. Expands far fewer nodes than uninformed search.
Optimality
Not optimal: may return suboptimal paths. Greedy considers only the estimated distance to goal (h), ignoring cost already paid (g), so it can commit to a path that looks close but is expensive. Example: with a straight-line heuristic, obstacles can make the closest-looking route longer.
Completeness
Complete in finite spaces with duplicate detection (a closed list); incomplete in infinite graphs. May explore dead-end branches when the heuristic misleads.
Advantages and Disadvantages
Advantage: very fast, memory-efficient. Disadvantage: suboptimal, can be misled by bad heuristic. Use when speed more important than optimality.
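A self-contained toy run showing the suboptimality above: the heuristic values are admissible, yet greedy commits to the node that looks closer and returns a more expensive path. Graph, costs, and h values are all hypothetical.

```python
import heapq

def greedy_best_first(start, goal, neighbors, h):
    """Expand the open node with the lowest h(n); ignores path cost g."""
    open_list = [(h(start), start, [start])]
    closed = set()
    while open_list:
        _, node, path = heapq.heappop(open_list)
        if node == goal:
            return path
        if node in closed:
            continue
        closed.add(node)
        for nbr, _w in neighbors(node):
            if nbr not in closed:
                heapq.heappush(open_list, (h(nbr), nbr, path + [nbr]))
    return None

# S->A->G costs 11; S->B->G costs 4 (optimal). h is admissible throughout.
graph = {"S": [("A", 1), ("B", 2)], "A": [("G", 10)], "B": [("G", 2)], "G": []}
h = {"S": 4, "A": 1, "B": 2, "G": 0}   # A *looks* closer to G than B does
path = greedy_best_first("S", "G", lambda n: graph[n], lambda n: h[n])
cost = sum(dict(graph[u])[v] for u, v in zip(path, path[1:]))
```

Greedy pops A before B (h = 1 vs. 2) and reaches the goal through the expensive edge; A* on the same graph would return S->B->G.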
A* Algorithm
Algorithm Definition
f(n) = g(n) + h(n)
where:
g(n) = actual cost from start to n
h(n) = estimated cost from n to goal
f(n) = estimated total cost via n
A* expands node with lowest f(n)
How A* Works
Combines: g(n) cost incurred (actual), h(n) estimated remaining. f(n) balances exploring closer nodes (low g) vs. promising nodes (low h). Expand lowest f until goal.
Pseudocode
import heapq

def a_star(start, goal, neighbors, cost, h):
    open_list = [(h(start), start)]   # priority queue ordered by f = g + h
    g = {start: 0}                    # best known cost from start
    parent = {start: None}
    closed_list = set()
    while open_list:
        _, current = heapq.heappop(open_list)  # lowest f
        if current == goal:
            path = []
            while current is not None:         # reconstruct path via parents
                path.append(current)
                current = parent[current]
            return path[::-1]
        if current in closed_list:
            continue
        closed_list.add(current)
        for neighbor in neighbors(current):
            g_neighbor = g[current] + cost(current, neighbor)
            if neighbor not in g or g_neighbor < g[neighbor]:
                g[neighbor] = g_neighbor
                parent[neighbor] = current
                heapq.heappush(open_list, (g_neighbor + h(neighbor), neighbor))
    return None  # failure
Optimality Proof
A* optimal if h admissible. Proof sketch: when the goal is popped from the queue, its f is the minimum on the queue. Since h(goal) = 0, f(goal) = g(goal), the cost of the path found. Every other frontier node has f >= f(goal), and since h never overestimates, any path through such a node costs at least its f. Hence the first goal popped is optimal.
Completeness
A* is complete on finite graphs: it explores all nodes if necessary. On infinite graphs it remains complete if every edge cost is at least some epsilon > 0 and a finite-cost solution exists.
Efficiency
Optimally efficient: among algorithms guaranteed optimal with the same consistent heuristic, A* expands the fewest nodes (up to tie-breaking). Efficient if h is informative; with a weak h (near zero), A* degenerates toward uniform-cost search.
Optimality and Completeness Proofs
Admissibility → Optimality
Theorem: A* with admissible h finds an optimal solution. Proof sketch: the first goal popped has the lowest f on the queue. Admissibility gives f(n) = g(n) + h(n) <= g(n) + h*(n), so every node on an optimal path has f no greater than the optimal cost; such nodes are therefore expanded before any costlier goal.
Consistency → Optimality
Stronger: if h is consistent, A* finds the optimal solution and never reopens a node (never revises a settled cost upward). Consistency implies admissibility, but not all admissible heuristics are consistent.
Completeness with Consistency
With a consistent heuristic, the first expansion of each node is along an optimal path, so no node is ever reopened. With an admissible but inconsistent heuristic, A* remains complete and optimal on finite graphs, but may reopen nodes and repeat work.
Counterexample: Non-Admissible
A non-admissible h may overestimate, so the first goal found might not be optimal. Example: an h that overestimates by 2 units can lead A* to return a path of cost 10 even though an optimal path of cost 8 exists.
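The counterexample can be made concrete with a minimal A* that returns the cost of the first goal popped. With the admissible heuristic A* finds the optimal cost 2; inflating h(A) makes the cheap detour look bad, and A* returns 10. Graph and values are hypothetical.

```python
import heapq

def a_star_cost(start, goal, graph, h):
    """Minimal A*: returns the cost of the first goal popped from the queue."""
    open_list = [(h(start), 0, start)]   # entries: (f, g, node)
    best_g = {start: 0}
    while open_list:
        f, g, node = heapq.heappop(open_list)
        if node == goal:
            return g
        if g > best_g.get(node, float("inf")):
            continue  # stale entry
        for nbr, w in graph[node]:
            g2 = g + w
            if g2 < best_g.get(nbr, float("inf")):
                best_g[nbr] = g2
                heapq.heappush(open_list, (g2 + h(nbr), g2, nbr))
    return None

# S->G direct costs 10; S->A->G costs 2 (optimal). True h*(A) = 1.
graph = {"S": [("G", 10), ("A", 1)], "A": [("G", 1)], "G": []}
h_admissible = {"S": 2, "A": 1, "G": 0}
h_inflated   = {"S": 2, "A": 100, "G": 0}  # grossly overestimates h*(A)
optimal_cost = a_star_cost("S", "G", graph, lambda n: h_admissible[n])
found_cost   = a_star_cost("S", "G", graph, lambda n: h_inflated[n])
```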
Trade-Off with Efficiency
Admissible heuristics may expand more nodes (weaker estimates). Non-admissible may expand fewer but find suboptimal. Practical choice: accept slight suboptimality for significant efficiency gain.
Iterative Deepening A* (IDA*)
Algorithm Idea
IDA*: combines iterative deepening with A*. Iteratively increase a cost threshold. Each iteration: depth-first search, expanding only nodes with f(n) <= threshold. The next threshold is the smallest f value that exceeded the current bound; repeat until the goal is found.
Pseudocode
import math

def ida_star(start, goal, neighbors, cost, h):
    threshold = h(start)
    path = [start]
    while True:
        t = search(path, 0, threshold, goal, neighbors, cost, h)
        if t == "FOUND":
            return path                    # path from start to goal
        if t == math.inf:
            return None                    # failure: no solution
        threshold = t                      # smallest f that exceeded the bound

def search(path, g, threshold, goal, neighbors, cost, h):
    node = path[-1]
    f = g + h(node)
    if f > threshold:
        return f                           # report f for the next threshold
    if node == goal:
        return "FOUND"
    minimum = math.inf                     # min f among pruned successors
    for neighbor in neighbors(node):
        if neighbor in path:               # avoid cycles on the current path
            continue
        path.append(neighbor)
        t = search(path, g + cost(node, neighbor), threshold, goal, neighbors, cost, h)
        if t == "FOUND":
            return "FOUND"
        minimum = min(minimum, t)
        path.pop()
    return minimum
Memory Efficiency
IDA* uses O(d) memory where d = search depth. Contrast A*: O(N) where N = nodes explored. For large graphs, IDA* much better memory.
Time Complexity
Each iteration repeats the work of previous ones (revisits nodes), but when the number of nodes grows geometrically with the threshold, the final iteration dominates. Total time O(b^d), where b = branching factor; same order as A* with a good heuristic. Similar time, much less memory.
Advantages and Disadvantages
Advantage: memory-bounded, optimal. Disadvantage: redoes work across iterations, slower in practice than A* if memory permits. Use IDA* for space-constrained systems.
Variants
Recursive Best-First Search (RBFS): best-first search in linear space, backing up f values along the recursion (Korf, 1992). Simplified Memory-Bounded A* (SMA*): limit open list size, drop least promising nodes.
Memory-Bounded Search
Problem with A*
A* stores all frontier nodes in memory. Large graphs: memory can exceed available. Large search space = huge open list = memory exhausted.
IDA* Solution
IDA* uses depth-first iteration: O(d) memory. Trade-off: recompute nodes across iterations. Effective for problems where time available more than memory.
Simplified Memory-Bounded A* (SMA*)
Like A* but with a fixed memory limit. When memory is full: drop the leaf with the highest f (least promising), backing its f value up to its parent. If that subtree matters later, it is regenerated from the stored value. Makes optimal use of limited memory; returns an optimal solution when one fits in memory.
Memory-Bounded Strategies
Breadth-first then backtrack: explore front level, backtrack when memory full. Prioritize based on promise: keep best nodes, discard worst. Cache: store intermediate results, reload as needed.
Practical Considerations
System memory: determine memory budget. Heuristic quality: good h more important than memory savings (fewer nodes to store). Problem size: memory constraints critical for huge problems.
Designing Effective Heuristics
Problem Relaxation
Remove constraints, problem becomes easier. Optimal solution of relaxed problem lower bound on original. Example: 8-puzzle, allow sliding tiles anywhere (not just adjacent) gives heuristic.
Subproblem Solutions
Solve subsets of problem optimally, cache solutions. Use as heuristic. Example: pattern databases precompute optimal solutions for goal subsets, query for heuristic value.
Multiple Heuristics
Compute multiple heuristics, take maximum (still admissible, stronger). Example: both Manhattan distance and misplaced tiles for 8-puzzle, use max.
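Taking the pointwise maximum can be sketched in one combinator; the component heuristics and node name here are placeholders.

```python
def h_max(heuristics):
    """Combine admissible heuristics by pointwise max: still admissible
    (each component is a lower bound, so their max is too), and at least
    as informed as any single component."""
    return lambda n: max(h(n) for h in heuristics)

# Illustrative values on a single node (both assumed admissible).
h1 = lambda n: {"X": 3}[n]   # e.g. Manhattan distance
h2 = lambda n: {"X": 2}[n]   # e.g. misplaced tiles
h  = h_max([h1, h2])
value = h("X")
```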
Decomposable Heuristics
Decompose problem: h(n) = sum of heuristic values for independent subproblems. Admissible if subproblems non-overlapping. Enables modular heuristic design.
Learning from Experience
Solve instances, store solutions. Use patterns to predict heuristic. Machine learning: train neural network, learns heuristic from data. Generalizes to new instances.
Testing Heuristics
Empirical evaluation: run A* with candidate heuristics, measure nodes expanded. Better heuristic = fewer nodes. Compare search efficiency.
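The node-expansion comparison can be instrumented directly: run the same A* with h = 0 (uninformed baseline) and with an informative heuristic, and count expansions. Graph and heuristic values below are hypothetical.

```python
import heapq

def a_star_expansions(start, goal, graph, h):
    """A* that also counts node expansions (non-stale pops)."""
    open_list = [(h(start), 0, start)]   # entries: (f, g, node)
    best_g = {start: 0}
    expanded = 0
    while open_list:
        f, g, node = heapq.heappop(open_list)
        if g > best_g.get(node, float("inf")):
            continue                     # stale entry
        expanded += 1
        if node == goal:
            return g, expanded
        for nbr, w in graph[node]:
            g2 = g + w
            if g2 < best_g.get(nbr, float("inf")):
                best_g[nbr] = g2
                heapq.heappush(open_list, (g2 + h(nbr), g2, nbr))
    return None, expanded

# Unit-cost chain to the goal plus a distracting dead-end branch.
graph = {
    "S": [("A", 1), ("X", 1)],
    "A": [("B", 1)], "B": [("G", 1)],
    "X": [("Y", 1)], "Y": [("Z", 1)], "Z": [],
    "G": [],
}
h_exact = {"S": 3, "A": 2, "B": 1, "G": 0, "X": 4, "Y": 5, "Z": 6}
cost0, n0 = a_star_expansions("S", "G", graph, lambda n: 0)            # uninformed
cost1, n1 = a_star_expansions("S", "G", graph, lambda n: h_exact[n])   # informed
```

Both runs return the same optimal cost; the informed run expands fewer nodes because the exact heuristic steers it away from the dead-end branch.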
Applications in AI Planning and Games
Pathfinding in Games
Video game AI: A* finds shortest paths. Heuristic: Euclidean/Manhattan distance to goal. Efficient pathfinding for NPCs. Standard in game engines.
Puzzle Solving
8-puzzle, 15-puzzle: A* solves optimally. Heuristic: Manhattan distance (sum distances each tile from goal). Solves in practical time.
Robot Motion Planning
Plan collision-free paths. Configuration space: each point = robot position/orientation. A* searches configuration space. Heuristic: straight-line distance (ignoring obstacles).
AI Planning
Plan action sequences. State space: problem states. Actions: transitions. A* finds the shortest plan. Heuristic: optimal plan cost of a relaxed problem (constraints dropped).
Game Tree Search
Chess, games: alpha-beta pruning uses heuristic evaluation. Heuristic: estimate position value. Pruning: ignore clearly bad moves.
Knowledge Base Query
Find most likely explanation for observations. Search hypothesis space. Heuristic: agreement with known facts. A* guides search toward consistent hypotheses.
References
- Hart, P. E., Nilsson, N. J., and Raphael, B. "A Formal Basis for the Heuristic Determination of Minimum Cost Paths." IEEE Transactions on Systems Science and Cybernetics, vol. 4, no. 2, 1968, pp. 100-107.
- Russell, S. J., and Norvig, P. "Artificial Intelligence: A Modern Approach." Prentice Hall, 4th edition, 2020.
- Korf, R. E. "Depth-First Iterative-Deepening: An Optimal Admissible Tree Search." Artificial Intelligence, vol. 27, 1985, pp. 97-109.
- Korf, R. E. "Linear-Space Best-First Search: Summary of Results." AAAI, 1992.
- Pohl, I. "Heuristic Search Viewed as Path Finding in Graphs." Artificial Intelligence, vol. 1, 1970, pp. 193-204.