Introduction
Backward chaining is a goal-driven inference strategy where reasoning begins with a desired conclusion (goal) and works backward to find evidence supporting that goal. Instead of starting with known facts and deriving consequences, backward chaining starts with a target goal and recursively decomposes it into subgoals until reaching known facts or base cases.
The strategy is fundamental in logic programming (Prolog), expert systems, and theorem proving. It's especially effective when the goal space is small but the fact space is large, as it avoids exploring irrelevant facts and focuses computation on goal-relevant reasoning.
"Backward chaining asks 'How can we prove this goal?' rather than 'What can we derive from these facts?' This goal-driven focus makes it powerful for diagnostic and problem-solving tasks." -- Stuart Russell, UC Berkeley
Problem Definition
Knowledge Base Structure
Represented as production rules:
IF (condition1 AND condition2 AND ... AND conditionN) THEN conclusion
Example: IF (animal=mammal AND has_stripes) THEN zebra
Rules link premises (conditions) to conclusions. Facts are represented as base cases (rules with no conditions).
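A knowledge base of this shape can be sketched in Python. The `Rule` structure and the predicate strings below are illustrative choices, not a standard API:

```python
from dataclasses import dataclass

@dataclass
class Rule:
    """IF all premises hold THEN conclusion holds; no premises => fact."""
    conclusion: str
    premises: tuple = ()

kb = [
    Rule("zebra", ("mammal", "has_stripes")),
    Rule("mammal"),        # base case: a fact
    Rule("has_stripes"),   # base case: a fact
]

# Facts are exactly the rules with empty condition lists.
facts = [r.conclusion for r in kb if not r.premises]
print(facts)  # ['mammal', 'has_stripes']
```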
Goal and Subgoals
Given a goal (target to prove), the inference engine works backward:
Goal: zebra(X)
Can be proven if rule applies: animal(X) = mammal AND has_stripes(X)
New subgoals: mammal(X)?, has_stripes(X)?
Can be proven if facts satisfy both subgoals under the same binding, e.g. mammal(plains_animal) and has_stripes(plains_animal), giving X = plains_animal
Success Condition
Goal succeeds if all subgoals eventually satisfy existing facts. Failure if no rule applies and goal isn't a fact.
Backward Chaining Algorithm
Basic Recursive Algorithm
function BACKWARD_CHAIN(goal, KB, visited):
    if goal is a known fact:
        return success with proof
    if goal in visited:  (cycle detection)
        return failure
    for each rule in KB where rule.conclusion unifies with goal:
        all_proven = true
        for each subgoal in rule.premises:
            if BACKWARD_CHAIN(subgoal, KB, visited + {goal}) fails:
                all_proven = false
                break  (backtrack: try next rule)
        if all_proven:
            return success with proof  (all subgoals succeeded)
    return failure
Key Steps
- Goal Check: Is goal a known fact? If yes, success.
- Rule Matching: Find rules whose conclusion matches goal
- Subgoal Decomposition: For each matching rule, recursively prove premises
- Backtracking: If subgoals fail, try next rule
- Proof Construction: Track successful proof path
Example Execution
Rules:
R1: parent(X,Y) AND male(X) -> father(X,Y)
R2: parent(tom, bob) [fact]
R3: male(tom) [fact]
Query: father(tom, bob)?
Backward chain:
father(tom, bob)?
-> Match R1: need parent(tom, bob)? AND male(tom)?
   -> parent(tom, bob)? -> Match R2: SUCCESS
   -> male(tom)? -> Match R3: SUCCESS
-> All subgoals proven: father(tom, bob) = TRUE
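This trace can be reproduced with a small ground (variable-free) backward chainer in Python. Atoms are plain strings and each rule is a `(premises, conclusion)` pair; a real engine would unify variables rather than compare strings, so this is only a sketch:

```python
def backward_chain(goal, rules, facts, visited=frozenset()):
    """Prove `goal` by recursively proving the premises of matching rules."""
    if goal in facts:                        # base case: known fact
        return True
    if goal in visited:                      # cycle detection
        return False
    for premises, conclusion in rules:
        if conclusion != goal:
            continue                         # rule head does not match goal
        if all(backward_chain(p, rules, facts, visited | {goal})
               for p in premises):
            return True                      # every subgoal proven
    return False                             # no rule applies: failure

facts = {"parent(tom,bob)", "male(tom)"}                          # R2, R3
rules = [(("parent(tom,bob)", "male(tom)"), "father(tom,bob)")]   # R1 (ground)

print(backward_chain("father(tom,bob)", rules, facts))  # True
print(backward_chain("father(bob,tom)", rules, facts))  # False
```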
Unification and Pattern Matching
Unification Definition
Process of finding variable substitutions that make two logical expressions identical. Essential for matching goals with rule conclusions.
Pattern: father(X, bob)
Goal: father(tom, bob)
Unification: X = tom
Pattern: ancestor(X, Y)
Goal: ancestor(tom, bob)
Unification: X = tom, Y = bob
Unification Algorithm
function UNIFY(term1, term2, substitution):
    term1 = APPLY_SUBSTITUTION(term1, substitution)
    term2 = APPLY_SUBSTITUTION(term2, substitution)
    if term1 == term2:
        return substitution  (success)
    if term1 is a variable:
        if OCCURS_CHECK(term1, term2):
            return failure  (prevents infinite structures)
        return substitution + {term1: term2}
    if term2 is a variable:
        return substitution + {term2: term1}
    if both are compound terms with the same functor and arity:
        for each argument pair (arg1_i, arg2_i):
            substitution = UNIFY(arg1_i, arg2_i, substitution)
            if substitution is failure:
                return failure
        return substitution
    return failure  (cannot unify)
Most General Unifier (MGU)
If two terms unify, there exists a most general unifier - a unifier from which every other unifier can be obtained by further substitution. Important for multiple solution generation (different variable bindings).
Occurs Check
Prevents infinite structures (e.g., X = f(X) is invalid). Essential for soundness; omitting it (as many Prolog implementations do by default) trades soundness for efficiency.
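The unification pseudocode can be made concrete in Python. The conventions here are assumptions for the sketch: variables are strings starting with an uppercase letter, compound terms are tuples like `('father', 'X', 'bob')`, and substitutions are plain dicts:

```python
def is_var(t):
    """A variable is a string starting with an uppercase letter (convention)."""
    return isinstance(t, str) and t[:1].isupper()

def walk(t, subst):
    """Follow variable bindings until an unbound variable or non-variable."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def occurs(v, t, subst):
    """Occurs check: does variable v appear inside term t?"""
    t = walk(t, subst)
    if v == t:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, subst) for a in t)

def unify(t1, t2, subst=None):
    """Return a substitution dict unifying t1 and t2, or None on failure."""
    if subst is None:
        subst = {}
    t1, t2 = walk(t1, subst), walk(t2, subst)
    if t1 == t2:
        return subst
    if is_var(t1):
        return None if occurs(t1, t2, subst) else {**subst, t1: t2}
    if is_var(t2):
        return None if occurs(t2, t1, subst) else {**subst, t2: t1}
    if isinstance(t1, tuple) and isinstance(t2, tuple) and len(t1) == len(t2):
        for a, b in zip(t1, t2):          # same functor/arity: unify argwise
            subst = unify(a, b, subst)
            if subst is None:
                return None
        return subst
    return None                           # cannot unify

print(unify(("father", "X", "bob"), ("father", "tom", "bob")))  # {'X': 'tom'}
print(unify("X", ("f", "X")))  # None (occurs check rejects X = f(X))
```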
Search Strategies (DFS, BFS)
Depth-First Search (DFS)
Explore goals as deep as possible before backtracking. Standard in Prolog.
Advantages: Minimal memory, finds first solution quickly, naturally handles depth-limited search
Disadvantages: May get trapped in infinite loops, may not find shallowest solution
Breadth-First Search (BFS)
Explore all subgoals at current depth before deeper levels.
Advantages: Guaranteed to find shallowest solution, complete for finite search spaces
Disadvantages: High memory requirements, slower for single-solution queries
Iterative Deepening
Hybrid approach: DFS with increasing depth limits. Gets memory efficiency of DFS with completeness of BFS.
for depth = 0 to max_depth:
    result = DFS_with_depth_limit(goal, KB, depth)
    if result found:
        return result
return failure
Avoids infinite loops while keeping memory low. Small overhead: nodes near root revisited multiple times.
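A minimal Python sketch of depth-limited DFS wrapped in iterative deepening, reusing the ground goal-as-string encoding (an illustrative choice, not a standard representation):

```python
def dls(goal, rules, facts, depth):
    """Depth-limited backward chaining: fail once the depth budget is spent."""
    if goal in facts:
        return True
    if depth == 0:
        return False
    return any(all(dls(p, rules, facts, depth - 1) for p in premises)
               for premises, conclusion in rules if conclusion == goal)

def iterative_deepening(goal, rules, facts, max_depth=10):
    """Retry DFS with growing depth limits: DFS memory, BFS completeness."""
    return any(dls(goal, rules, facts, d) for d in range(max_depth + 1))

facts = {"parent(a,b)", "parent(b,c)"}
rules = [
    (("ancestor(a,c)",), "ancestor(a,c)"),            # would loop a plain DFS
    (("parent(a,b)", "parent(b,c)"), "ancestor(a,c)"),
]
print(iterative_deepening("ancestor(a,c)", rules, facts))  # True (at depth 1)
```

The depth budget makes the pathological self-recursive rule harmless: each retry bottoms out instead of recursing forever.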
Heuristic Search
Use domain knowledge to prioritize promising goals. Example: in diagnostic systems, order rules by probability of success.
Proof Trees and Explanation
Proof Tree Structure
Backward chaining naturally generates a proof tree showing how goal was derived:
        father(tom, bob)
         /            \
parent(tom,bob)     male(tom)
        |               |
    R2 (fact)       R3 (fact)
Root: goal; internal nodes: subgoals; leaves: facts or base cases.
Explanation Generation
By retaining proof tree, system can explain reasoning: "tom is father of bob because tom is parent of bob AND tom is male."
Why and How Queries
Why?: "Why is tom a father of bob?" Answer: Because he's a parent and male.
How?: "How can I prove someone is an ancestor?" Answer: By proving they're a parent or parent of ancestor.
Proof trees support natural language explanation critical for expert systems requiring trust and understanding.
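A chainer that returns proof trees instead of booleans makes explanation generation direct. This Python sketch (ground atoms as strings, trees as `(goal, subproofs)` pairs, both illustrative choices) prints a "why" explanation by walking the tree:

```python
def prove(goal, rules, facts, visited=frozenset()):
    """Return a proof tree (goal, subproofs) on success, else None."""
    if goal in facts:
        return (goal, [])                      # leaf: known fact
    if goal in visited:                        # cycle detection
        return None
    for premises, conclusion in rules:
        if conclusion != goal:
            continue
        subproofs = [prove(p, rules, facts, visited | {goal})
                     for p in premises]
        if all(sp is not None for sp in subproofs):
            return (goal, subproofs)           # internal node: subgoals
    return None

def explain(tree, indent=0):
    """Render a natural-language-style 'why' explanation from the tree."""
    goal, subs = tree
    reason = ("because " + " AND ".join(s[0] for s in subs)
              if subs else "(known fact)")
    print("  " * indent + goal + " " + reason)
    for s in subs:
        explain(s, indent + 1)

facts = {"parent(tom,bob)", "male(tom)"}
rules = [(("parent(tom,bob)", "male(tom)"), "father(tom,bob)")]
explain(prove("father(tom,bob)", rules, facts))
# father(tom,bob) because parent(tom,bob) AND male(tom)
#   parent(tom,bob) (known fact)
#   male(tom) (known fact)
```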
Backward vs. Forward Chaining
| Dimension | Backward Chaining | Forward Chaining |
|---|---|---|
| Direction | Goal-driven (hypothesis to evidence) | Data-driven (evidence to conclusions) |
| When to Use | Small goal space, large fact space | Few rules, many facts |
| Efficiency | Focused; avoids irrelevant facts | Explores all applicable rules |
| Explanation | Natural: "Why is this true?" | Less natural: "What follows?" |
| Complexity | Depends on goal depth | Can generate many irrelevant conclusions |
Hybrid Approaches
Some systems use both: forward chaining to derive likely conclusions, backward chaining for specific queries. Or backward chaining with forward chaining sub-components.
Implementation Techniques
Choice Points and Backtracking
At each point where multiple rules match, create choice point. If subgoals fail, backtrack to choice point and try alternative rule. Efficiently implemented using stack of choice points.
Clause Indexing
Speed up rule lookup by pre-indexing rules by functor and first argument. Instead of searching all rules, access matching rules directly.
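First-level indexing can be sketched with a dict keyed by the conclusion's functor and arity. The rule encoding below (`(head, body)` with `head = (functor, args)`) is an illustrative choice:

```python
from collections import defaultdict

rules = [
    (("father", ("tom", "bob")), []),
    (("ancestor", ("X", "Y")), [("parent", ("X", "Y"))]),
    (("ancestor", ("X", "Z")), [("parent", ("X", "Y")),
                                ("ancestor", ("Y", "Z"))]),
]

# Index: (functor, arity) of the head -> candidate clauses.
index = defaultdict(list)
for head, body in rules:
    functor, args = head
    index[(functor, len(args))].append((head, body))

# A goal like ancestor(a, c) now retrieves only the two ancestor/2
# clauses directly instead of scanning the whole rule list.
print(len(index[("ancestor", 2)]))  # 2
```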
Tail Recursion Optimization
When last goal in rule body is recursive call, reuse stack frame (tail call optimization). Critical for logic programming efficiency.
Tabling / Memoization
Cache goals and their results to avoid re-computation. When goal encountered again, return cached result instead of re-proving.
table(ancestor(X, Y)).
Query ancestor(a, c)?
-> Prove recursively, cache result
Query ancestor(a, c)? again
-> Look up in table, return immediately
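A simplified tabled prover in Python, again over ground string atoms. Real tabled systems (XSB, SWI-Prolog) use SLG resolution to handle answers computed inside cycles correctly; this sketch merely caches finished goals and refuses to re-enter in-progress ones:

```python
def make_tabled_prover(rules, facts):
    """Backward chainer with a memo table (simplified tabling sketch)."""
    table = {}                                 # goal -> proven?
    def prove(goal, visited=frozenset()):
        if goal in table:
            return table[goal]                 # cached: no re-proving
        if goal in facts:
            table[goal] = True
            return True
        if goal in visited:
            return False                       # in progress: not cached
        result = any(all(prove(p, visited | {goal}) for p in premises)
                     for premises, conclusion in rules
                     if conclusion == goal)
        table[goal] = result
        return result
    return prove

facts = {"parent(a,b)", "parent(b,c)"}
rules = [(("parent(a,b)",), "ancestor(a,b)"),
         (("parent(b,c)",), "ancestor(b,c)"),
         (("parent(a,b)", "ancestor(b,c)"), "ancestor(a,c)")]
prove = make_tabled_prover(rules, facts)
print(prove("ancestor(a,c)"))  # True (first call proves and caches)
print(prove("ancestor(a,c)"))  # True (second call is a table lookup)
```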
Optimizations and Pruning
Cut (!) Operator
Prolog's cut: once a clause chosen, prevent backtracking to other clauses for same goal. Improves efficiency by eliminating choice points.
max(X, Y, X) :- X >= Y, !. % Commit to this clause
max(X, Y, Y). % Only tried if first clause fails
Negation as Failure
NOT goal succeeds if goal fails. Closed-world assumption: if not provable, assume false.
not_parent(X, Y) :- not parent(X, Y).
Query: not_parent(tom, bob)?
If parent(tom, bob) fails to prove, then not_parent succeeds.
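Negation as failure reduces to wrapping the prover in a Boolean `not`, as in this Python sketch (ground atoms as strings; cycle detection omitted for brevity):

```python
def prove(goal, rules, facts):
    """Plain backward chaining (no cycle detection, for brevity)."""
    if goal in facts:
        return True
    return any(all(prove(p, rules, facts) for p in premises)
               for premises, conclusion in rules if conclusion == goal)

def not_provable(goal, rules, facts):
    """Negation as failure: succeed exactly when `goal` cannot be proven.
    Under the closed-world assumption, 'not provable' is read as false."""
    return not prove(goal, rules, facts)

facts = {"parent(tom,bob)"}
print(not_provable("parent(tom,bob)", [], facts))  # False
print(not_provable("parent(bob,tom)", [], facts))  # True (closed world)
```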
Depth Limiting
Prevent infinite loops by limiting recursion depth. Trade-off: may miss valid proofs at greater depth.
Constraint Propagation
For constraint satisfaction problems (CSPs), propagate constraints to eliminate infeasible variable values early.
Applications
Diagnostic Expert Systems
Given symptoms (observations), find diseases (goals). Backward chaining naturally queries: "What disease explains these symptoms?"
Logic Programming
Prolog uses backward chaining. Programs written as rules; queries solved via backward chaining. Natural for symbolic reasoning, planning, NLP parsing.
Planning and Theorem Proving
AI planning: goal is desired state; backward chain finds action sequence. Automated theorem proving: goal is theorem; backward chain finds proof.
Question Answering
Given knowledge base, answer questions by backward chaining. Query: "Who is father of bob?" -> Backward chain to find X such that father(X, bob).
Knowledge Bases and Semantic Web
Query languages (SPARQL) use backward-chaining-like query execution over RDF knowledge graphs.
Limitations and Challenges
Infinite Loops
Recursive rules can lead to infinite backward chaining. Cycle detection and depth limiting are necessary. The occurs check prevents infinite terms during unification, but not loops caused by recursive rules.
Handling Uncertainty
Pure backward chaining handles only binary logic. Extensions (Bayesian networks, fuzzy logic) needed for probabilistic reasoning.
Scalability
For large knowledge bases with many rules, backward chaining can be slow. Indexing and tabling help but not complete solution.
Interoperability with Learning
Traditional backward chaining assumes fixed, hand-crafted knowledge base. Integration with machine learning (learning rules from data) remains challenging.
Common Pitfalls
- Inefficient rule ordering (try most likely rules first)
- Redundant rules causing repeated computation
- Unsafe negation (NOT) with negation-as-failure can produce unexpected results
References
- Russell, S. J., and Norvig, P. "Artificial Intelligence: A Modern Approach." Prentice Hall, 3rd edition, 2009.
- Clocksin, W. F., and Mellish, C. S. "Programming in Prolog." Springer, 5th edition, 2003.
- Lloyd, J. W. "Foundations of Logic Programming." Springer-Verlag, 2nd edition, 1987.
- Genesereth, M. R., and Nilsson, N. J. "Logical Foundations of Artificial Intelligence." Morgan Kaufmann, 1987.
- Hanus, M. "Declarative Programming with Functional Logic Languages." Journal of Functional and Logic Programming, 1999.