Introduction

Informed search: algorithms using domain knowledge (heuristics) to guide search toward goal. Contrast uninformed: breadth-first, depth-first search blindly explore. Heuristics estimate cost from current node to goal. Prioritize promising nodes, ignore unpromising. Enables efficient search in huge spaces.

Core idea: use problem-specific knowledge to prune search. Instead exploring all paths, focus on likely paths to goal. Dramatic speedup possible: exponential spaces become tractable with good heuristics.

Key algorithms: Greedy best-first (fast but not optimal), A* (optimal with admissible heuristic), IDA* (memory-bounded). Heuristic quality crucial: good heuristic = efficient search, bad = poor performance.

"Informed search harnesses domain knowledge to transform intractable search spaces into solvable problems. Well-designed heuristics enable optimal solutions to problems exponential in uninformed search." -- Heuristic search methods

Informed vs. Uninformed Search

Uninformed Search

Uninformed (blind): no problem-specific knowledge. Only graph structure guides search (neighbor relationships). BFS: explores level-by-level. DFS: explores depth-first. Guarantees: BFS optimal for unit edge costs, both complete on finite graphs, but exponential time/space.

Informed Search

Informed (heuristic): uses heuristic evaluating promise of nodes. Heuristic estimates cost to goal. Prioritize high-promise nodes. Empirically much faster. But needs good heuristic, may not find optimal solution if heuristic misleading.

Efficiency Gains

Uninformed: search 1 million nodes. Informed with good heuristic: search 1,000 nodes. 1000x speedup possible. Difference between solving instantly vs. computationally infeasible.

Knowledge Requirements

Informed requires understanding problem domain. What makes solution good? How estimate distance to goal? Uninformed generic, works any problem but slowly.

Trade-Offs

Uninformed: guaranteed optimal, complete. Slow. Informed: greedy variants may be suboptimal or incomplete; A* with an admissible heuristic remains optimal. Much faster. Choose based on requirements.

Heuristic Functions

Heuristic Definition

Heuristic function h(n): estimates cost from node n to goal. h(goal) = 0. h(n) >= 0 for all n. Domain-specific: depends on problem. Not exact, approximation.

Example Heuristics

Traveling Salesman: h(n) = straight-line distance to the nearest unvisited city. Route planning: h(n) = straight-line distance to destination. Puzzle: h(n) = number of misplaced tiles, or Manhattan distance (sum of tile distances from their goal positions). Water jug: h(n) = difference from goal state.
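
The two puzzle heuristics above can be sketched in a few lines of Python. This is an illustrative toy, assuming a state encoded as a tuple of 9 values with 0 as the blank and goal (1,...,8,0):

```python
# Two classic 8-puzzle heuristics; state = tuple of 9 values, 0 = blank.
GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)

def misplaced_tiles(state, goal=GOAL):
    """Count tiles (excluding the blank) not in their goal position."""
    return sum(1 for s, g in zip(state, goal) if s != 0 and s != g)

def manhattan_distance(state, goal=GOAL):
    """Sum of horizontal + vertical distances of each tile from its goal cell."""
    total = 0
    for idx, tile in enumerate(state):
        if tile == 0:
            continue
        goal_idx = goal.index(tile)
        total += abs(idx // 3 - goal_idx // 3) + abs(idx % 3 - goal_idx % 3)
    return total

state = (1, 2, 3, 4, 5, 6, 0, 7, 8)  # last two tiles shifted one cell right
print(misplaced_tiles(state))     # 2
print(manhattan_distance(state))  # 2
```

Manhattan distance is always at least as large as the misplaced-tiles count, so it is the more informative of the two while remaining admissible.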

Design Principles

Admissible: heuristic should lower-bound (never overestimate) the true cost. Relevant: heuristic captures an aspect of problem difficulty. Efficient: quick to compute (else no net speedup). Selective: distinguishes promising from unpromising nodes.

Relaxed Problems

Derive heuristics from relaxed versions. Remove constraints; the problem becomes easier, and the relaxed problem's optimal cost is an admissible heuristic. Example: the 8-puzzle relaxed so any tile can move to any square gives the misplaced-tiles heuristic.

Pattern Databases

Pre-compute optimal solutions for subproblems. Use as heuristic for full problem. Example: optimal solution for subset of puzzle tiles cached, queried as heuristic.

Learning Heuristics

Machine learning: train neural networks to predict cost to goal. Example: learn to estimate remaining distance in game tree. Learns heuristic from experience.

Admissible Heuristics

Admissibility Definition

Heuristic h admissible if never overestimates true cost. h(n) <= actual cost from n to goal. Formally: h(n) <= h*(n) where h* true cost.
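The definition can be checked empirically on a small graph: compute h*(n) exactly (Dijkstra from the goal over reversed edges) and verify h(n) <= h*(n) everywhere. A minimal sketch, with an assumed toy graph and assumed heuristic values:

```python
import heapq

graph = {                       # node -> {neighbor: edge cost} (assumed toy graph)
    'S': {'A': 2, 'B': 5},
    'A': {'G': 4},
    'B': {'G': 1},
    'G': {},
}
h = {'S': 5, 'A': 4, 'B': 1, 'G': 0}   # candidate heuristic (assumed values)

def true_costs(graph, goal):
    """Dijkstra from the goal over reversed edges yields h*(n) for every node."""
    reverse = {n: {} for n in graph}
    for u, nbrs in graph.items():
        for v, c in nbrs.items():
            reverse[v][u] = c
    dist = {goal: 0}
    frontier = [(0, goal)]
    while frontier:
        d, u = heapq.heappop(frontier)
        if d > dist.get(u, float('inf')):
            continue                     # stale queue entry
        for v, c in reverse[u].items():
            if d + c < dist.get(v, float('inf')):
                dist[v] = d + c
                heapq.heappush(frontier, (d + c, v))
    return dist

h_star = true_costs(graph, 'G')
admissible = all(h[n] <= h_star.get(n, float('inf')) for n in graph)
print(admissible)  # True: h never overestimates on this graph
```

Such a check only validates h on the tested instance, but it is a useful sanity test when designing heuristics.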

Importance

Admissible heuristics guarantee optimal solutions (with A*). Non-admissible: may find suboptimal solutions. Trade-off: admissible often weaker (less informative), non-admissible stronger but risky.

Examples

Straight-line distance: underestimates actual distance (the straight line is shortest). Misplaced tiles: underestimates moves needed (ignores movement constraints). Both admissible.

If a heuristic overestimates, it is inadmissible and A* may return suboptimal solutions. Example: a route-planning heuristic that inflates straight-line distance by a large factor to account for obstacles can overestimate the actual cost, so A* may commit to a suboptimal route.

Consistency (Monotonicity)

Consistent: h(n) <= cost(n,n') + h(n') for all neighbors n'. Stronger property: consistent implies admissible. Consistency prevents revising node costs (ensures optimal path properties).
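
The triangle inequality above is easy to verify mechanically: test every edge. A small sketch on the same kind of assumed toy graph:

```python
# Check the consistency inequality h(n) <= cost(n, n') + h(n') on every edge.
graph = {                       # node -> {neighbor: edge cost} (assumed toy graph)
    'S': {'A': 2, 'B': 5},
    'A': {'G': 4},
    'B': {'G': 1},
    'G': {},
}
h = {'S': 5, 'A': 4, 'B': 1, 'G': 0}   # assumed heuristic values

def is_consistent(graph, h):
    return all(h[u] <= c + h[v]
               for u, nbrs in graph.items()
               for v, c in nbrs.items())

print(is_consistent(graph, h))  # True
```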

Relationship to Dominance

h1 dominates h2 if h1(n) >= h2(n) for all n (h1 is the tighter bound). For admissible heuristics, dominance means A* with h1 expands no more nodes than with h2 (up to tie-breaking on equal f). Search with a dominant heuristic is therefore at least as efficient.

A* Algorithm

Algorithm Definition

f(n) = g(n) + h(n)

where:
g(n) = actual cost from start to n
h(n) = estimated cost from n to goal
f(n) = estimated total cost via n

A* expands node with lowest f(n)

How A* Works

Combines: g(n) cost incurred (actual), h(n) estimated remaining. f(n) balances exploring closer nodes (low g) vs. promising nodes (low h). Expand lowest f until goal.

Pseudocode

import heapq

def a_star(start, goal, neighbors, cost, h):
    open_list = [(h(start), start)]   # priority queue ordered by f = g + h
    g_score = {start: 0}              # best known g; also serves as the closed record
    came_from = {start: None}         # parent pointers for path reconstruction

    while open_list:
        f, current = heapq.heappop(open_list)   # lowest f first
        if current == goal:
            return reconstruct_path(came_from, goal)

        for neighbor in neighbors(current):
            g_neighbor = g_score[current] + cost(current, neighbor)
            if g_neighbor < g_score.get(neighbor, float('inf')):
                g_score[neighbor] = g_neighbor
                came_from[neighbor] = current
                heapq.heappush(open_list, (g_neighbor + h(neighbor), neighbor))

    return None   # open list exhausted: no path exists

def reconstruct_path(came_from, node):
    path = []
    while node is not None:
        path.append(node)
        node = came_from[node]
    return path[::-1]

Optimality Proof

A* optimal if h admissible. Proof sketch: when the goal is popped, its f is the lowest in the queue, and since h(goal) = 0, f(goal) = g(goal) = cost of the found path. Every node still on the frontier has f >= f(goal), and admissibility means any path through such a node costs at least its f. No cheaper path remains; the first goal popped is optimal.

Completeness

A* complete on finite graphs with positive edge costs: it explores all nodes if necessary. In infinite graphs, A* may run forever, but it still finds a goal in finite time if one exists and edge costs are bounded below by some epsilon > 0.

Efficiency

Expands optimally: among optimal algorithms using the same consistent heuristic, A* expands the fewest nodes, up to tie-breaking. Efficient if h good. With a weak h it degenerates: h = 0 reduces A* to uniform-cost (Dijkstra) search.

Optimality and Completeness Proofs

Admissibility → Optimality

Theorem: A* with admissible h finds an optimal solution. Proof sketch: the first solution popped from the queue has the lowest f on the frontier. Admissibility gives f(n) = g(n) + h(n) <= g(n) + h*(n) for every node n, so each frontier node's f underestimates the cost of the best path through it. Since every such f is at least the found solution's cost, no cheaper path exists.

Consistency → Optimality

Stronger: if h consistent, A* finds the optimal solution and never reopens a node (no cost is ever revised upward). Consistency implies admissibility, but not all admissible heuristics are consistent.

Completeness with Consistency

Consistency ensures nodes are expanded in nondecreasing f order and never reopened, so the search proceeds directly toward the goal. An inconsistent (but admissible) heuristic keeps A* complete and optimal on finite graphs, but closed nodes may be reopened when cheaper paths to them are found, repeating work.

Counterexample: Non-Admissible

Non-admissible h may overestimate, so the first goal found might not be optimal. Example: an h that overestimates by 2 units can lead A* to return a path of cost 10 when an optimal path of cost 8 exists.
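
A tiny counterexample can be run directly. This sketch assumes the usual A* that stops when the goal is popped; the overestimate at node A (h(A) = 5, true cost 1) steers the search onto the costlier direct edge:

```python
import heapq

graph = {'S': {'A': 1, 'G': 3}, 'A': {'G': 1}, 'G': {}}  # assumed toy graph
h = {'S': 0, 'A': 5, 'G': 0}   # inadmissible: h(A)=5 overestimates true cost 1

def a_star_cost(start, goal):
    frontier = [(h[start], 0, start)]      # entries: (f, g, node)
    best_g = {start: 0}
    while frontier:
        f, g, node = heapq.heappop(frontier)
        if node == goal:
            return g                       # stop on first goal pop
        for nbr, c in graph[node].items():
            if g + c < best_g.get(nbr, float('inf')):
                best_g[nbr] = g + c
                heapq.heappush(frontier, (g + c + h[nbr], g + c, nbr))
    return None

print(a_star_cost('S', 'G'))  # 3, although the optimal path S-A-G costs 2
```

A is pushed with f = 1 + 5 = 6, while the direct edge to G has f = 3, so the goal is popped first via the suboptimal route.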

Trade-Off with Efficiency

Admissible heuristics may expand more nodes (weaker estimates). Non-admissible may expand fewer but find suboptimal. Practical choice: accept slight suboptimality for significant efficiency gain.

Iterative Deepening A* (IDA*)

Algorithm Idea

IDA*: combines iterative deepening with A*. Iteratively increase threshold f_limit. Each iteration: depth-first search, only expand nodes with f(n) <= f_limit. Increase threshold, repeat.

Pseudocode

def ida_star(start, goal, neighbors, cost, h):
    threshold = h(start)

    while True:
        result, new_threshold = search(start, 0, threshold, goal, neighbors, cost, h)
        if result is not None:
            return result                    # goal found (path tracking omitted)
        if new_threshold == float('inf'):
            return None                      # no node exceeded the bound: failure
        threshold = new_threshold            # smallest f that exceeded the bound

def search(node, g, threshold, goal, neighbors, cost, h):
    f = g + h(node)
    if f > threshold:
        return None, f                       # cut off; report the exceeding f
    if node == goal:
        return node, f

    min_exceeded = float('inf')
    for neighbor in neighbors(node):
        result, t = search(neighbor, g + cost(node, neighbor),
                           threshold, goal, neighbors, cost, h)
        if result is not None:
            return result, t
        min_exceeded = min(min_exceeded, t)  # track smallest f beyond threshold

    return None, min_exceeded

Memory Efficiency

IDA* uses O(d) memory where d = search depth. Contrast A*: O(N) where N = nodes explored. For large graphs, IDA* much better memory.

Time Complexity

Each iteration repeats the work of previous iterations (revisits nodes). Total time O(b^d), where b = branching factor and d = solution depth. A* is also O(b^d) in the worst case. Asymptotically similar time, much less memory.

Advantages and Disadvantages

Advantage: memory-bounded, optimal. Disadvantage: redoes work across iterations, slower in practice than A* if memory permits. Use IDA* for space-constrained systems.

Variants

Recursive Best-First Search (RBFS): linear-space best-first search that simulates A* using f-limits and backtracking. Memory-bounded A* (MA*) and its simplified form SMA*: limit the open list size, dropping the least promising nodes when memory fills.

Memory-Bounded Search

Problem with A*

A* stores all frontier nodes in memory. Large graphs: memory can exceed available. Large search space = huge open list = memory exhausted.

IDA* Solution

IDA* uses depth-first iteration: O(d) memory. Trade-off: recomputes nodes across iterations. Effective when time is more plentiful than memory.

Simplified Memory-Bounded A* (SMA*)

Like A* but with a memory limit. When memory is full: drop the node with the highest f (least promising), recording its cost in its parent. If that forgotten subtree later looks promising again, it is regenerated. Optimizes use of limited memory.

Memory-Bounded Strategies

Breadth-first then backtrack: explore front level, backtrack when memory full. Prioritize based on promise: keep best nodes, discard worst. Cache: store intermediate results, reload as needed.

Practical Considerations

System memory: determine memory budget. Heuristic quality: good h more important than memory savings (fewer nodes to store). Problem size: memory constraints critical for huge problems.

Designing Effective Heuristics

Problem Relaxation

Remove constraints; the problem becomes easier. The relaxed problem's optimal cost is a lower bound on the original. Example: 8-puzzle relaxed so any tile moves to any square gives misplaced tiles; relaxed so tiles slide to any adjacent square (ignoring the blank) gives Manhattan distance.

Subproblem Solutions

Solve subsets of problem optimally, cache solutions. Use as heuristic. Example: pattern databases precompute optimal solutions for goal subsets, query for heuristic value.

Multiple Heuristics

Compute multiple heuristics, take maximum (still admissible, stronger). Example: both Manhattan distance and misplaced tiles for 8-puzzle, use max.
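
The max-combination is a one-liner. A small sketch with two assumed, illustrative heuristics over integer states (target value 10):

```python
# Pointwise maximum of admissible heuristics: still admissible, and at
# least as informative as any single component.
def combined(heuristics):
    return lambda state: max(hf(state) for hf in heuristics)

h1 = lambda s: abs(s - 10)                  # assumed admissible estimate #1
h2 = lambda s: (10 - s) if s < 10 else 0    # assumed admissible estimate #2
h = combined([h1, h2])

print(h(4))   # 6: max(6, 6)
print(h(13))  # 3: max(3, 0)
```

Each component must itself be admissible for the maximum to be admissible.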

Decomposable Heuristics

Decompose problem: h(n) = sum of heuristic values for independent subproblems. Admissible if subproblems non-overlapping. Enables modular heuristic design.

Learning from Experience

Solve instances, store solutions. Use patterns to predict heuristic. Machine learning: train neural network, learns heuristic from data. Generalizes to new instances.

Testing Heuristics

Empirical evaluation: run A* with candidate heuristics, measure nodes expanded. Better heuristic = fewer nodes. Compare search efficiency.
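
This evaluation can be sketched concretely. The setup below is an assumed toy benchmark (an open 10x10 grid): run A* with a zero heuristic (uniform-cost search) and with Manhattan distance, counting expansions:

```python
import heapq

N, GOAL = 10, (9, 9)   # assumed open 10x10 grid, goal at the far corner

def neighbors(p):
    x, y = p
    return [(nx, ny) for nx, ny in ((x+1, y), (x-1, y), (x, y+1), (x, y-1))
            if 0 <= nx < N and 0 <= ny < N]

def a_star_expansions(start, goal, h):
    # Ties on f are broken toward larger g, a common choice favoring deeper nodes.
    frontier = [(h(start), 0, start)]       # entries: (f, -g, node)
    best_g, expanded = {start: 0}, 0
    while frontier:
        f, neg_g, node = heapq.heappop(frontier)
        g = -neg_g
        if node == goal:
            return expanded
        if g > best_g.get(node, float('inf')):
            continue                        # stale entry
        expanded += 1
        for nbr in neighbors(node):
            if g + 1 < best_g.get(nbr, float('inf')):
                best_g[nbr] = g + 1
                heapq.heappush(frontier, (g + 1 + h(nbr), -(g + 1), nbr))
    return expanded

zero = lambda p: 0                                        # uninformed baseline
manhattan = lambda p: abs(p[0] - GOAL[0]) + abs(p[1] - GOAL[1])
e_uninformed = a_star_expansions((0, 0), GOAL, zero)
e_manhattan = a_star_expansions((0, 0), GOAL, manhattan)
print(e_uninformed, e_manhattan)   # Manhattan expands far fewer nodes
```

The expansion counts give a direct, instance-level measure of heuristic quality.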

Applications in AI Planning and Games

Pathfinding in Games

Video game AI: A* finds shortest paths. Heuristic: Euclidean/Manhattan distance to goal. Efficient pathfinding for NPCs. Standard in game engines.
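
A minimal NPC-style pathfinding sketch, assuming a small hand-made map where '#' is a wall, 4-way movement, and unit step costs:

```python
import heapq

GRID = ["....#...",    # assumed map layout, '#' = wall
        ".##.#.#.",
        ".#..#.#.",
        ".#.##.#.",
        "........"]
H, W = len(GRID), len(GRID[0])

def find_path(start, goal):
    """A* with Manhattan distance; returns the list of cells on a shortest path."""
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    frontier = [(h(start), 0, start)]       # entries: (f, g, node)
    came_from, best_g = {start: None}, {start: 0}
    while frontier:
        _, g, node = heapq.heappop(frontier)
        if node == goal:
            path = []
            while node is not None:         # walk parent pointers back to start
                path.append(node)
                node = came_from[node]
            return path[::-1]
        r, c = node
        for nr, nc in ((r+1, c), (r-1, c), (r, c+1), (r, c-1)):
            if 0 <= nr < H and 0 <= nc < W and GRID[nr][nc] != '#':
                if g + 1 < best_g.get((nr, nc), float('inf')):
                    best_g[(nr, nc)] = g + 1
                    came_from[(nr, nc)] = node
                    heapq.heappush(frontier,
                                   (g + 1 + h((nr, nc)), g + 1, (nr, nc)))
    return None

path = find_path((0, 0), (0, 7))
print(len(path) - 1)   # number of moves on the shortest route found
```

Game engines typically layer extras on this core: grid preprocessing, path smoothing, and hierarchical search for large maps.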

Puzzle Solving

8-puzzle, 15-puzzle: A* solves optimally. Heuristic: Manhattan distance (sum distances each tile from goal). Solves in practical time.

Robot Motion Planning

Plan collision-free paths. Configuration space: each point = robot position/orientation. A* searches configuration space. Heuristic: straight-line distance (ignoring obstacles).

AI Planning

Plan action sequences. State space: problem states. Actions: transitions. A* finds the shortest plan. Heuristic: optimal cost of a relaxed problem (constraints removed).

Game Tree Search

Chess, games: alpha-beta pruning uses heuristic evaluation. Heuristic: estimate position value. Pruning: ignore clearly bad moves.

Knowledge Base Query

Find most likely explanation for observations. Search hypothesis space. Heuristic: agreement with known facts. A* guides search toward consistent hypotheses.
