Introduction
Backward chaining is a goal-driven inference strategy where reasoning begins with a desired conclusion (goal) and works backward to find evidence supporting that goal. Instead of starting with known facts and deriving consequences, backward chaining starts with a target goal and recursively decomposes it into subgoals until reaching known facts or base cases.
The strategy is fundamental in logic programming (Prolog), expert systems, and theorem proving. It's especially effective when the goal space is small but the fact space is large, as it avoids exploring irrelevant facts and focuses computation on goal-relevant reasoning.
"Backward chaining asks 'How can we prove this goal?' rather than 'What can we derive from these facts?' This goal-driven focus makes it powerful for diagnostic and problem-solving tasks." -- Stuart Russell, UC Berkeley
Problem Definition
Knowledge Base Structure
Represented as production rules:
IF (condition1 AND condition2 AND ... AND conditionN) THEN conclusion
Example: IF (animal=mammal AND has_stripes) THEN zebra
Rules link premises (conditions) to conclusions. Facts are represented as base cases (rules with no conditions).
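A knowledge base of this shape can be sketched in Python. The `Rule` structure and the predicate strings below are illustrative choices, not a standard API:

```python
from dataclasses import dataclass

@dataclass
class Rule:
    """IF all premises hold THEN conclusion holds; no premises => fact."""
    conclusion: str
    premises: tuple = ()

kb = [
    Rule("zebra", ("mammal", "has_stripes")),
    Rule("mammal"),        # base case: a fact
    Rule("has_stripes"),   # base case: a fact
]

# Facts are exactly the rules with empty condition lists.
facts = [r.conclusion for r in kb if not r.premises]
print(facts)  # ['mammal', 'has_stripes']
```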
Goal and Subgoals
Given a goal (target to prove), the inference engine works backward:
Goal: zebra(X)
Can be proven if rule applies: animal(X) = mammal AND has_stripes(X)
New subgoals: mammal(X)?, has_stripes(X)?
Can be proven if facts satisfy both subgoals under the same binding, e.g. mammal(plains_animal) and has_stripes(plains_animal), giving X = plains_animal
Success Condition
Goal succeeds if all subgoals eventually satisfy existing facts. Failure if no rule applies and goal isn't a fact.
Backward Chaining Algorithm
Basic Recursive Algorithm
function BACKWARD_CHAIN(goal, KB, visited):
    if goal is a known fact:
        return success with proof
    if goal in visited:  (cycle detection)
        return failure
    for each rule in KB where rule.conclusion unifies with goal:
        all_proven = true
        for each subgoal in rule.premises:
            if BACKWARD_CHAIN(subgoal, KB, visited + {goal}) fails:
                all_proven = false
                break  (backtrack: try next rule)
        if all_proven:
            return success with proof  (all subgoals succeeded)
    return failure
Key Steps
- Goal Check: Is goal a known fact? If yes, success.
- Rule Matching: Find rules whose conclusion matches goal
- Subgoal Decomposition: For each matching rule, recursively prove premises
- Backtracking: If subgoals fail, try next rule
- Proof Construction: Track successful proof path
Example Execution
Rules:
R1: parent(X,Y) AND male(X) -> father(X,Y)
R2: parent(tom, bob) [fact]
R3: male(tom) [fact]
Query: father(tom, bob)?
Backward chain:
father(tom, bob)?
-> Match R1: need parent(tom, bob)? AND male(tom)?
   -> parent(tom, bob)? -> Match R2: SUCCESS
   -> male(tom)? -> Match R3: SUCCESS
-> All subgoals proven: father(tom, bob) = TRUE
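This trace can be reproduced with a small ground (variable-free) backward chainer in Python. Atoms are plain strings and each rule is a `(premises, conclusion)` pair; a real engine would unify variables rather than compare strings, so this is only a sketch:

```python
def backward_chain(goal, rules, facts, visited=frozenset()):
    """Prove `goal` by recursively proving the premises of matching rules."""
    if goal in facts:                        # base case: known fact
        return True
    if goal in visited:                      # cycle detection
        return False
    for premises, conclusion in rules:
        if conclusion != goal:
            continue                         # rule head does not match goal
        if all(backward_chain(p, rules, facts, visited | {goal})
               for p in premises):
            return True                      # every subgoal proven
    return False                             # no rule applies: failure

facts = {"parent(tom,bob)", "male(tom)"}                          # R2, R3
rules = [(("parent(tom,bob)", "male(tom)"), "father(tom,bob)")]   # R1 (ground)

print(backward_chain("father(tom,bob)", rules, facts))  # True
print(backward_chain("father(bob,tom)", rules, facts))  # False
```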
Unification and Pattern Matching
Unification Definition
Process of finding variable substitutions that make two logical expressions identical. Essential for matching goals with rule conclusions.
Pattern: father(X, bob)
Goal: father(tom, bob)
Unification: X = tom
Pattern: ancestor(X, Y)
Goal: ancestor(tom, bob)
Unification: X = tom, Y = bob
Unification Algorithm
function UNIFY(term1, term2, substitution):
    term1 = APPLY_SUBSTITUTION(term1, substitution)
    term2 = APPLY_SUBSTITUTION(term2, substitution)
    if term1 == term2:
        return substitution  (success)
    if term1 is a variable:
        if OCCURS_CHECK(term1, term2):
            return failure  (prevents infinite structures)
        return substitution + {term1: term2}
    if term2 is a variable:
        return substitution + {term2: term1}
    if both are compound terms with the same functor and arity:
        for each argument pair (arg1_i, arg2_i):
            substitution = UNIFY(arg1_i, arg2_i, substitution)
            if substitution is failure:
                return failure
        return substitution
    return failure  (cannot unify)
Most General Unifier (MGU)
If two terms unify, there exists a most general unifier - a unifier from which every other unifier can be obtained by further substitution. Important for multiple solution generation (different variable bindings).
Occurs Check
Prevents infinite structures (e.g., X = f(X) is invalid). Essential for soundness; omitting it (as many Prolog implementations do by default) trades soundness for efficiency.
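The unification pseudocode can be made concrete in Python. The conventions here are assumptions for the sketch: variables are strings starting with an uppercase letter, compound terms are tuples like `('father', 'X', 'bob')`, and substitutions are plain dicts:

```python
def is_var(t):
    """A variable is a string starting with an uppercase letter (convention)."""
    return isinstance(t, str) and t[:1].isupper()

def walk(t, subst):
    """Follow variable bindings until an unbound variable or non-variable."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def occurs(v, t, subst):
    """Occurs check: does variable v appear inside term t?"""
    t = walk(t, subst)
    if v == t:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, subst) for a in t)

def unify(t1, t2, subst=None):
    """Return a substitution dict unifying t1 and t2, or None on failure."""
    if subst is None:
        subst = {}
    t1, t2 = walk(t1, subst), walk(t2, subst)
    if t1 == t2:
        return subst
    if is_var(t1):
        return None if occurs(t1, t2, subst) else {**subst, t1: t2}
    if is_var(t2):
        return None if occurs(t2, t1, subst) else {**subst, t2: t1}
    if isinstance(t1, tuple) and isinstance(t2, tuple) and len(t1) == len(t2):
        for a, b in zip(t1, t2):          # same functor/arity: unify argwise
            subst = unify(a, b, subst)
            if subst is None:
                return None
        return subst
    return None                           # cannot unify

print(unify(("father", "X", "bob"), ("father", "tom", "bob")))  # {'X': 'tom'}
print(unify("X", ("f", "X")))  # None (occurs check rejects X = f(X))
```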
Search Strategies (DFS, BFS)
Depth-First Search (DFS)
Explore goals as deep as possible before backtracking. Standard in Prolog.
Advantages: Minimal memory, finds first solution quickly, naturally handles depth-limited search
Disadvantages: May get trapped in infinite loops, may not find shallowest solution
Breadth-First Search (BFS)
Explore all subgoals at current depth before deeper levels.
Advantages: Guaranteed to find shallowest solution, complete for finite search spaces
Disadvantages: High memory requirements, slower for single-solution queries
Iterative Deepening
Hybrid approach: DFS with increasing depth limits. Gets memory efficiency of DFS with completeness of BFS.
for depth = 0 to max_depth:
    result = DFS_with_depth_limit(goal, KB, depth)
    if result found:
        return result
return failure
Avoids infinite loops while keeping memory low. Small overhead: nodes near root revisited multiple times.
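A minimal Python sketch of depth-limited DFS wrapped in iterative deepening, reusing the ground goal-as-string encoding (an illustrative choice, not a standard representation):

```python
def dls(goal, rules, facts, depth):
    """Depth-limited backward chaining: fail once the depth budget is spent."""
    if goal in facts:
        return True
    if depth == 0:
        return False
    return any(all(dls(p, rules, facts, depth - 1) for p in premises)
               for premises, conclusion in rules if conclusion == goal)

def iterative_deepening(goal, rules, facts, max_depth=10):
    """Retry DFS with growing depth limits: DFS memory, BFS completeness."""
    return any(dls(goal, rules, facts, d) for d in range(max_depth + 1))

facts = {"parent(a,b)", "parent(b,c)"}
rules = [
    (("ancestor(a,c)",), "ancestor(a,c)"),            # would loop a plain DFS
    (("parent(a,b)", "parent(b,c)"), "ancestor(a,c)"),
]
print(iterative_deepening("ancestor(a,c)", rules, facts))  # True (at depth 1)
```

The depth budget makes the pathological self-recursive rule harmless: each retry bottoms out instead of recursing forever.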
Heuristic Search
Use domain knowledge to prioritize promising goals. Example: in diagnostic systems, order rules by probability of success.
Proof Trees and Explanation
Proof Tree Structure
Backward chaining naturally generates a proof tree showing how goal was derived:
        father(tom, bob)
         /            \
parent(tom,bob)     male(tom)
        |               |
    R2 (fact)       R3 (fact)
Root: goal; internal nodes: subgoals; leaves: facts or base cases.
Explanation Generation
By retaining proof tree, system can explain reasoning: "tom is father of bob because tom is parent of bob AND tom is male."
Why and How Queries
Why?: "Why is tom a father of bob?" Answer: Because he's a parent and male.
How?: "How can I prove someone is an ancestor?" Answer: By proving they're a parent or parent of ancestor.
Proof trees support natural language explanation critical for expert systems requiring trust and understanding.
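A chainer that returns proof trees instead of booleans makes explanation generation direct. This Python sketch (ground atoms as strings, trees as `(goal, subproofs)` pairs, both illustrative choices) prints a "why" explanation by walking the tree:

```python
def prove(goal, rules, facts, visited=frozenset()):
    """Return a proof tree (goal, subproofs) on success, else None."""
    if goal in facts:
        return (goal, [])                      # leaf: known fact
    if goal in visited:                        # cycle detection
        return None
    for premises, conclusion in rules:
        if conclusion != goal:
            continue
        subproofs = [prove(p, rules, facts, visited | {goal})
                     for p in premises]
        if all(sp is not None for sp in subproofs):
            return (goal, subproofs)           # internal node: subgoals
    return None

def explain(tree, indent=0):
    """Render a natural-language-style 'why' explanation from the tree."""
    goal, subs = tree
    reason = ("because " + " AND ".join(s[0] for s in subs)
              if subs else "(known fact)")
    print("  " * indent + goal + " " + reason)
    for s in subs:
        explain(s, indent + 1)

facts = {"parent(tom,bob)", "male(tom)"}
rules = [(("parent(tom,bob)", "male(tom)"), "father(tom,bob)")]
explain(prove("father(tom,bob)", rules, facts))
# father(tom,bob) because parent(tom,bob) AND male(tom)
#   parent(tom,bob) (known fact)
#   male(tom) (known fact)
```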
Backward vs. Forward Chaining
| Dimension | Backward Chaining | Forward Chaining |
|---|---|---|
| Direction | Goal-driven (hypothesis to evidence) | Data-driven (evidence to conclusions) |
| When to Use | Small goal space, large fact space | Few rules, many facts |
| Efficiency | Focused; avoids irrelevant facts | Explores all applicable rules |
| Explanation | Natural: "Why is this true?" | Less natural: "What follows?" |
| Complexity | Depends on goal depth | Can generate many irrelevant conclusions |
Hybrid Approaches
Some systems use both: forward chaining to derive likely conclusions, backward chaining for specific queries. Or backward chaining with forward chaining sub-components.
Implementation Techniques
Choice Points and Backtracking
At each point where multiple rules match, create choice point. If subgoals fail, backtrack to choice point and try alternative rule. Efficiently implemented using stack of choice points.
Clause Indexing
Speed up rule lookup by pre-indexing rules by functor and first argument. Instead of searching all rules, access matching rules directly.
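First-level indexing can be sketched with a dict keyed by the conclusion's functor and arity. The rule encoding below (`(head, body)` with `head = (functor, args)`) is an illustrative choice:

```python
from collections import defaultdict

rules = [
    (("father", ("tom", "bob")), []),
    (("ancestor", ("X", "Y")), [("parent", ("X", "Y"))]),
    (("ancestor", ("X", "Z")), [("parent", ("X", "Y")),
                                ("ancestor", ("Y", "Z"))]),
]

# Index: (functor, arity) of the head -> candidate clauses.
index = defaultdict(list)
for head, body in rules:
    functor, args = head
    index[(functor, len(args))].append((head, body))

# A goal like ancestor(a, c) now retrieves only the two ancestor/2
# clauses directly instead of scanning the whole rule list.
print(len(index[("ancestor", 2)]))  # 2
```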
Tail Recursion Optimization
When last goal in rule body is recursive call, reuse stack frame (tail call optimization). Critical for logic programming efficiency.
Tabling / Memoization
Cache goals and their results to avoid re-computation. When goal encountered again, return cached result instead of re-proving.
table(ancestor(X, Y)).
Query ancestor(a, c)?
-> Prove recursively, cache result
Query ancestor(a, c)? again
-> Look up in table, return immediately
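A simplified tabled prover in Python, again over ground string atoms. Real tabled systems (XSB, SWI-Prolog) use SLG resolution to handle answers computed inside cycles correctly; this sketch merely caches finished goals and refuses to re-enter in-progress ones:

```python
def make_tabled_prover(rules, facts):
    """Backward chainer with a memo table (simplified tabling sketch)."""
    table = {}                                 # goal -> proven?
    def prove(goal, visited=frozenset()):
        if goal in table:
            return table[goal]                 # cached: no re-proving
        if goal in facts:
            table[goal] = True
            return True
        if goal in visited:
            return False                       # in progress: not cached
        result = any(all(prove(p, visited | {goal}) for p in premises)
                     for premises, conclusion in rules
                     if conclusion == goal)
        table[goal] = result
        return result
    return prove

facts = {"parent(a,b)", "parent(b,c)"}
rules = [(("parent(a,b)",), "ancestor(a,b)"),
         (("parent(b,c)",), "ancestor(b,c)"),
         (("parent(a,b)", "ancestor(b,c)"), "ancestor(a,c)")]
prove = make_tabled_prover(rules, facts)
print(prove("ancestor(a,c)"))  # True (first call proves and caches)
print(prove("ancestor(a,c)"))  # True (second call is a table lookup)
```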
Optimizations and Pruning
Cut (!) Operator
Prolog's cut: once a clause chosen, prevent backtracking to other clauses for same goal. Improves efficiency by eliminating choice points.
max(X, Y, X) :- X >= Y, !. % Commit to this clause
max(X, Y, Y). % Only tried if first clause fails
Negation as Failure
NOT goal succeeds if goal fails. Closed-world assumption: if not provable, assume false.
not_parent(X, Y) :- not parent(X, Y).
Query: not_parent(tom, bob)?
If parent(tom, bob) fails to prove, then not_parent succeeds.
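Negation as failure reduces to wrapping the prover in a Boolean `not`, as in this Python sketch (ground atoms as strings; cycle detection omitted for brevity):

```python
def prove(goal, rules, facts):
    """Plain backward chaining (no cycle detection, for brevity)."""
    if goal in facts:
        return True
    return any(all(prove(p, rules, facts) for p in premises)
               for premises, conclusion in rules if conclusion == goal)

def not_provable(goal, rules, facts):
    """Negation as failure: succeed exactly when `goal` cannot be proven.
    Under the closed-world assumption, 'not provable' is read as false."""
    return not prove(goal, rules, facts)

facts = {"parent(tom,bob)"}
print(not_provable("parent(tom,bob)", [], facts))  # False
print(not_provable("parent(bob,tom)", [], facts))  # True (closed world)
```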
Depth Limiting
Prevent infinite loops by limiting recursion depth. Trade-off: may miss valid proofs at greater depth.
Constraint Propagation
For constraint satisfaction problems (CSPs), propagate constraints to eliminate infeasible variable values early.
Applications
Diagnostic Expert Systems
Given symptoms (observations), find diseases (goals). Backward chaining naturally queries: "What disease explains these symptoms?"
Logic Programming
Prolog uses backward chaining. Programs written as rules; queries solved via backward chaining. Natural for symbolic reasoning, planning, NLP parsing.
Planning and Theorem Proving
AI planning: goal is desired state; backward chain finds action sequence. Automated theorem proving: goal is theorem; backward chain finds proof.
Question Answering
Given knowledge base, answer questions by backward chaining. Query: "Who is father of bob?" -> Backward chain to find X such that father(X, bob).
Knowledge Bases and Semantic Web
Query languages (SPARQL) use backward-chaining-like query execution over RDF knowledge graphs.
Limitations and Challenges
Infinite Loops
Recursive rules can lead to infinite backward chaining. Cycle detection and depth limiting are necessary. The occurs check prevents infinite terms during unification, but not loops caused by recursive rules.
Handling Uncertainty
Pure backward chaining handles only binary logic. Extensions (Bayesian networks, fuzzy logic) needed for probabilistic reasoning.
Scalability
For large knowledge bases with many rules, backward chaining can be slow. Indexing and tabling help but not complete solution.
Interoperability with Learning
Traditional backward chaining assumes fixed, hand-crafted knowledge base. Integration with machine learning (learning rules from data) remains challenging.
Common Pitfalls
- Inefficient rule ordering (try most likely rules first)
- Redundant rules causing repeated computation
- Unsafe negation (NOT) with negation-as-failure can produce unexpected results
References
- Russell, S. J., and Norvig, P. "Artificial Intelligence: A Modern Approach." Prentice Hall, 3rd edition, 2009.
- Clocksin, W. F., and Mellish, C. S. "Programming in Prolog." Springer, 5th edition, 2003.
- Lloyd, J. W. "Foundations of Logic Programming." Springer-Verlag, 2nd edition, 1987.
- Genesereth, M. R., and Nilsson, N. J. "Logical Foundations of Artificial Intelligence." Morgan Kaufmann, 1987.
- Hanus, M. "Declarative Programming with Functional Logic Languages." Journal of Functional and Logic Programming, 1999.