Introduction
First-order logic (FOL), also called predicate logic, extends propositional logic with predicates, variables, and quantifiers. Expressive than propositional: can represent relationships between objects, quantify over properties. Foundation for knowledge-based systems, automated reasoning, logic programming.
Key enhancement: instead of atomic propositions (true/false), FOL reasons about predicates applied to objects. Example: "All birds fly" inexpressible in propositional logic, easily expressed in FOL: ∀x (Bird(x) → Flies(x)).
Trade-off: increased expressiveness comes at cost. Propositional logic decidable (can always determine if formula valid). FOL semi-decidable: can prove entailment but no guarantee of termination for non-entailed formulas. No complete, efficient algorithm exists.
"First-order logic strikes balance: expressive enough for most knowledge representation, yet sufficiently tractable for reasoning. Most AI systems built on FOL foundations." -- Knowledge representation theory
Limitations of Propositional Logic
Lack of Quantification
Propositional logic cannot express "all" or "some". Cannot say "all students need to study". Must individually state: Student1_NeedsStudy, Student2_NeedsStudy, ... impractical for infinite domains.
No Relationships Between Objects
Cannot relate objects: Parent(john, mary). Must use separate propositions: john_is_parent_of_mary. Scales poorly, loses structure.
Limited Reasoning Patterns
Pattern: if X has property A, then X has property B. Propositional logic cannot express this universally. Must manually encode all instances.
Knowledge Explosion
Knowledge base grows exponentially. For N objects and M relations, need O(N^M) propositions. First-order logic avoids explosion: single predicate covers all instances.
FOL Solution
Variables allow generalizing over objects. Quantifiers express "all" (∀) and "some" (∃). Predicates represent relationships. Together enable compact, expressive knowledge bases.
Syntax and Building Blocks
Symbols
FOL alphabet: constants (a, b, john), variables (x, y, z), predicates (P, Q, Father), functions (f, father), logical connectives (¬, ∧, ∨, →, ↔), quantifiers (∀, ∃), equality (=).
Constants and Variables
Constants denote specific objects (john, mary, 5). Variables range over objects (x, y). Variables bound by quantifiers or free (unbound). Difference key to understanding semantics.
Predicates
Predicate symbol P with arity n (P/n). P(x) unary predicate (property). P(x, y) binary (relation). Example: Father(x, y) means x is father of y.
Functions
Function symbols map objects to objects. father(x) returns father of x. Allows complex terms: grandfather(john) = father(father(john)).
Terms
Term: constant, variable, or function applied to terms. Examples: john (constant), x (variable), father(john) (function term). Terms are building blocks for formulas.
Predicates and Relations
Unary Predicates (Properties)
Predicate P(x) asserts property of x. Examples: Mortal(x), Bird(x), Student(x). Interpretation: assigns true/false to P(a) for each object a.
Binary Predicates (Relations)
Predicate P(x, y) relates two objects. Examples: Parent(x, y), Likes(x, y), Married(x, y). Reflexive (x R x), symmetric (x R y ⟹ y R x), transitive (x R y ∧ y R z ⟹ x R z) possible.
Higher Arity Predicates
Ternary, quaternary, etc. Example: Teaches(professor, student, course) three-place predicate. Arity selection crucial: too few loses information, too many complicates representation.
Encoding Relations
Strategy: choose predicates matching problem domain. Medical diagnosis: Disease(x), Symptom(x), HasSymptom(patient, symptom). Consistent naming conventions improve readability.
Special Predicates
Equality (=): x = y means x and y denote same object. Essential for identity statements. Example: father(john) = father(mary) means john and mary have same father.
Quantifiers: Universal and Existential
Universal Quantifier (∀)
∀x P(x) means "for all x, P(x) holds". True iff P(a) true for every object a in domain. Example: ∀x (Bird(x) → Flies(x)) means all birds fly.
Examples:
∀x (Mortal(x)) -- everything is mortal
∀x (Human(x) → Mortal(x)) -- all humans mortal
∀x ∀y (Parent(x, y) → Ancestor(x, y)) -- all parents ancestors
Existential Quantifier (∃)
∃x P(x) means "there exists x such that P(x) holds". True iff P(a) true for at least one object a. Example: ∃x (Bird(x) ∧ Flies(x)) means some bird flies.
Examples:
∃x (King(x)) -- there is at least one king
∃x (Human(x) ∧ Wise(x)) -- some human wise
∃y (Parent(x, y)) -- x has at least one parent
Scope and Nesting
Quantifier scope determines which variables bound. ∀x (P(x) ∨ Q(y)) binds x but not y. Nesting: ∀x ∃y Parent(x, y) means everyone has parent (existential under universal). ∃y ∀x Parent(x, y) means someone parent of everyone.
Free and Bound Variables
Variable bound if appears within scope of quantifier. Free if unbound. Formula with free variables denotes predicate (not truth value). Example: P(x) ∨ Q(y) has free variables x, y. ∀x (P(x) ∨ Q(y)) binds x, y remains free.
Closed Formulas (Sentences)
Formula with no free variables called sentence or closed formula. Has definite truth value under interpretation. Open formulas (free variables) denote properties/relations.
Well-Formed Formulas (WFF) in FOL
Syntax Rules
Well-formed formula defined recursively: (1) atomic formula P(t₁,...,tₙ) is WFF, (2) if F, G WFF then ¬F, (F∧G), (F∨G), (F→G), (F↔G) WFF, (3) if F WFF and x variable, then ∀x F and ∃x F WFF.
Operator Precedence
Bind tightest to loosest: quantifiers, ¬, ∧, ∨, →, ↔. Example: ∀x P(x) ∨ Q(x) parses as (∀x P(x)) ∨ Q(x). Use parentheses liberally for clarity.
Examples of WFFs
Atomic: Father(john, mary), x = 5
Negation: ¬Mortal(x)
Conjunction: Human(x) ∧ Mortal(x)
Quantified: ∀x (Human(x) → Mortal(x))
Complex: ∀x (Human(x) → ∃y (Parent(x, y) ∧ Human(y)))
Prenex Normal Form
All quantifiers at front, followed by quantifier-free formula. Every FOL formula convertible to prenex form. Useful for reasoning: ∀x ∃y (P(x) ∧ Q(x, y)).
Clause Form
Formula as conjunction of clauses. Clause: disjunction of literals (atom or negated atom). Example: (¬P(x) ∨ Q(x)) ∧ (R(x) ∨ ¬S(x)). Enables resolution theorem proving.
Semantics and Interpretations
Interpretation (Model)
Interpretation specifies domain (set of objects) and assigns meaning to symbols. Domain non-empty. Constants assigned objects. Predicates assigned relations (sets of tuples). Functions assigned mappings.
Example interpretation:
Domain: {john, mary, bob}
john = john, mary = mary, bob = bob
Father(john, mary) = true
Father(john, bob) = true
Father(mary, john) = false
... (all other atoms assigned truth values)
Evaluating Formulas
Given interpretation, recursively evaluate formula. Atomic: check predicate assignment. Connectives: apply truth functions. Quantifier: ∀x F true iff F true for all assignments to x. ∃x F true iff F true for some assignment.
Satisfiability and Validity
Formula satisfiable if exists interpretation making it true. Valid (tautology) if true in all interpretations. Unsatisfiable (contradiction) if false in all interpretations.
Entailment (Logical Consequence)
KB ⊨ Q (KB entails Q) iff every interpretation making KB true also makes Q true. Check: is KB ∧ ¬Q unsatisfiable?
Herbrand Semantics
Simplification: restrict domain to terms constructible from constants and functions (Herbrand universe). Reduces infinite domains. Not always sound but practical for many problems.
Inference Rules and Proof Systems
Modus Ponens (Generalized)
P(a)
P(x) → Q(x)
-----------
Q(a)
If P true of specific object a, and P implies Q generally, then Q true of a.
Universal Instantiation
From ∀x P(x), derive P(a) for any constant a. Instance of universal statement. Essential for applying general rules to specific cases.
Universal Generalization
From P(x) (where x arbitrary), derive ∀x P(x). Valid only if x doesn't appear free in assumptions. Critical restriction prevents incorrect generalizations.
Existential Instantiation
From ∃x P(x), introduce new constant (Skolem constant) a such that P(a). Justified: if something satisfies P, name it.
Existential Generalization
From P(a), derive ∃x P(x). Property true of specific object, so exists object with property.
Soundness and Completeness
First-order logic sound: provable formulas true. Complete: all valid formulas provable (Gödel completeness). But semi-decidable: no algorithm guaranteed to find proof for all entailments (halting problem).
Unification and Substitution
Substitution
Substitution σ = {x₁/t₁, x₂/t₂, ...} maps variables to terms. Applied to formula: Pσ replaces all free occurrences. Example: P(x, y)σ where σ = {x/john, y/mary} gives P(john, mary).
Unification Problem
Given two terms, find substitution making them identical. Example: unify P(x, mary) and P(john, y)? Unifier σ = {x/john, y/mary}. Most general unifier (mgu) exists if unifiable.
Unification Algorithm
def unify(t1, t2, σ = {}):
if t1 = t2: return σ
if t1 is variable:
if t1 in t2: return FAIL (occurs check)
return unify(rest, t2σ, σ ∪ {t1/t2})
if t2 is variable:
if t2 in t1: return FAIL
return unify(t1, rest, σ ∪ {t2/t1})
if t1, t2 compound:
if functor/arity differ: return FAIL
return unify args recursively
return FAIL
Occurs Check
Prevents x unifying with term containing x (x = f(x) impossible). Essential for correctness, but expensive. Some implementations omit for efficiency.
Applications
Unification foundation for resolution, logic programming. Prolog queries matched against facts via unification. Database: find matching records using unification.
Resolution in First-Order Logic
Resolution Rule
L₁ ∨ ... ∨ Lₙ
M₁ ∨ ... ∨ Mₘ
----------------
(L₁ ∨ ... ∨ Lₙ ∨ M₁ ∨ ... ∨ Mₘ) where unifies with ¬Lᵢ or Mⱼ
Resolvent: disjunction minus complementary literals after substitution.
Resolution Refutation
Prove KB ⊨ Q: negate Q, convert KB ∪ {¬Q} to clause form, apply resolution until empty clause (⊥) derived. Empty clause contradiction, so Q must follow from KB.
Algorithm
1. Convert KB ∪ {¬Q} to clause form
2. new_clauses = {}
3. While ⊥ not derived:
- Select two resolvable clauses
- Compute resolvent via unification
- If resolvent = ⊥: return TRUE (Q provable)
- Add resolvent to set
- If no new clauses: return FALSE (Q not provable)
- Repeat
Completeness
Resolution refutation complete for first-order logic: if KB ⊨ Q, eventually derives ⊥. But may not terminate if KB ⊭ Q (semi-decidable).
Practical Issues
Clause explosion: applying resolution generates many new clauses. Search space large. Heuristics: subsumption (remove redundant clauses), selection strategies (choose productive pairs), restrict depth.
Logic Programming and Prolog
Logic Programming Paradigm
Programming as logical inference. Program = set of logical rules (clauses). Query = goal to prove. Execution = search for proof. Declarative: specify what, not how.
Prolog Clauses
Prolog clause: head :- body. Head: atom. Body: conjunction of atoms (goals). Fact: clause with empty body. Example: parent(john, mary). Rule: parent(X, Z) :- parent(X, Y), parent(Y, Z).
Example Prolog program:
parent(tom, bob).
parent(tom, liz).
parent(bob, ann).
parent(bob, pat).
grandfather(X, Z) :- parent(X, Y), parent(Y, Z).
?- grandfather(tom, ann). % Query: is tom grandfather of ann?
Answer: yes (unifies: X/tom, Z/ann, Y/bob)
Backward Chaining
Query execution: start with goal, find matching clauses, recursively prove subgoals. Depth-first search: explore one branch fully before backtracking. Build proof tree.
Unification and Backtracking
Query terms unified with clause heads. Multiple clauses match? Try each via backtracking. On failure, backtrack to choice point, try alternative. Search systematically.
Cut (!) Operator
Cut prevents backtracking past it. Commits to current clause choice. Improves efficiency, enables non-monotonic reasoning. Example: max(X, Y, X) :- X >= Y, !. max(X, Y, Y).
Limitations
Negation as failure: ¬Goal true if Goal fails. Unsound for some problems (closed-world assumption: what not provable is false). Depth-first search may loop infinitely (not complete for cyclic derivations).
Decidability and Complexity
Semi-Decidability
FOL entailment semi-decidable: algorithm exists to prove valid formulas, but no algorithm guaranteed to terminate on invalid formulas. Contrast propositional logic (decidable).
Undecidable Fragments
Full first-order logic undecidable (Church-Turing theorem). Restricted fragments decidable: monadic FOL (predicates arity ≤1), two-variable FOL, Datalog (no function symbols).
Complexity
Satisfiability checking worst-case exponential. Practical performance varies: some problems solved quickly, others intractable. SAT solvers, SMT solvers (satisfiability modulo theories) exploit structure.
Herbrand's Theorem
Formula satisfiable iff ground instances (terms without variables) satisfiable. Reduces FOL to propositional: enumerate Herbrand universe, create propositional instance. Theoretically complete but often infinite instances needed.
Practical Reasoning
Real systems don't aim for complete reasoning. Use incomplete methods (heuristic search, constraint propagation) trading completeness for efficiency. Restrict domain, add assumptions (closed-world, unique-name), specialize to problem structure.
Applications in AI Systems
Knowledge-Based Systems
Expert systems encode domain knowledge as FOL rules. Engine performs forward/backward chaining to answer queries. Medical diagnosis: encode symptoms, diseases, relationships. Query: what disease explains symptoms?
Semantic Web and Ontologies
RDF/OWL web ontologies: facts, properties, relationships. Reasoning engines (reasoners) derive new facts via FOL inference. Enable knowledge sharing, integration across systems.
Natural Language Processing
Semantic role labeling, temporal reasoning, question answering. FOL represents sentence meaning: parse "John bought Mary a book" to Buy(john, mary, book). Enables structured inference.
Planning and Verification
AI planning: encode actions as FOL rules. State-space search finds sequence achieving goal. Software verification: encode program semantics, verify properties via theorem proving.
Database Query Languages
SQL, Datalog based on FOL semantics. Query = logical formula. Database = model. Query evaluation = satisfiability checking. Enables reasoning over large structured data.
Machine Learning Interpretability
Learning rule-based classifiers: induce FOL rules from data. More interpretable than neural networks. Explainability: learned rules provide symbolic explanations for predictions.
References
- Russell, S. J., and Norvig, P. "Artificial Intelligence: A Modern Approach." Prentice Hall, 4th edition, 2020.
- Mendelson, E. "Introduction to Mathematical Logic." Chapman & Hall, 6th edition, 2015.
- Fitting, M. "First-Order Logic and Automated Theorem Proving." Springer, 2nd edition, 1996.
- Genesereth, M. R., and Nilsson, N. J. "Logical Foundations of Artificial Intelligence." Morgan Kaufmann, 1987.
- Lloyd, J. W. "Foundations of Logic Programming." Springer, 2nd edition, 1987.
- Robinson, J. A. "Logic and Logic Programming." Oxford University Press, 1992.