Introduction
Genetic algorithms (GAs) are evolutionary algorithms mimicking natural selection. A population of candidate solutions (chromosomes) evolves over generations. Fittest solutions survive, reproduce, mutate. After many generations, population converges to high-quality solutions.
Applicable to optimization problems: parameter tuning, scheduling, design optimization, machine learning hyperparameter search. Strengths: global search (avoid local minima), parallelizable, handle non-linear/multi-modal landscapes. Weaknesses: slow convergence, no optimality guarantees, parameter tuning required.
Inspired by Darwin's natural selection. Stronger organisms reproduce, pass genes to offspring. Over time, population adapts to environment. GAs apply same principles to abstract problem spaces: better solutions persist, combine (crossover), occasionally mutate (exploration).
"Genetic algorithms harness evolution's power to solve complex optimization problems. Where traditional methods fail in rugged landscapes, GAs excel through parallel population-based search." -- Evolutionary computation research
Evolution and Natural Selection Principles
Natural Selection
Environment exerts pressure: organisms well-adapted to environment survive and reproduce. Over generations, population composition shifts toward more adapted individuals. Driving force: differential reproduction (fitter organisms produce more offspring).
Genetic Variation
Mutations create variation. Sexual reproduction (genetic recombination) combines genes. Variation provides material for selection: without variation, nothing to select.
Inheritance
Offspring inherit genes from parents. Beneficial mutations persist if they improve fitness. Harmful mutations eliminated by selection pressure.
GA Analogy
| Nature | Genetic Algorithm |
|---|---|
| Organism | Candidate solution (chromosome) |
| Fitness (survival) | Fitness function score |
| Genes | Problem parameters |
| Reproduction | Selection and crossover |
| Mutation | Random parameter perturbation |
Genetic Algorithm Overview
Algorithm Steps
1. Initialize: create random population of solutions
2. Evaluate: compute fitness for each solution
3. Select: choose fittest solutions (with probability proportional to fitness)
4. Crossover: combine selected solutions to create offspring
5. Mutate: randomly modify offspring genes
6. Replace: form new population
7. Repeat: steps 2-6 until convergence (fitness plateaus or max generations)
Pseudocode
import random

def genetic_algorithm(population_size, num_generations):
    population = [random_solution() for _ in range(population_size)]
    for generation in range(num_generations):
        # Evaluate
        fitness = [evaluate(solution) for solution in population]
        # Select
        parents = select_proportional_to_fitness(population, fitness)
        # Crossover and Mutate
        offspring = []
        for _ in range(population_size):
            parent1, parent2 = random.sample(parents, 2)
            child = crossover(parent1, parent2)
            child = mutate(child)
            offspring.append(child)
        # Replace
        population = offspring
    # Re-evaluate: the final population was never scored inside the loop
    fitness = [evaluate(solution) for solution in population]
    return best_solution(population, fitness)
Iterative Improvement
GA iteratively improves population: each generation, fitter solutions more likely to pass genes. Population gradually converges toward optimum (or local optimum).
Representation: Chromosomes and Genes
Binary Representation
Classic: solution encoded as bit string. Gene = single bit (0 or 1). Chromosome = sequence of bits. Example: 8-bit integer represented as "11010110" (214).
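The bit-string example above can be sketched in a few lines; the helper names (`decode_bits`, `encode_bits`) are illustrative, not from any library:

```python
def decode_bits(chromosome: str) -> int:
    """Interpret a bit string as an unsigned integer."""
    return int(chromosome, 2)

def encode_bits(value: int, length: int) -> str:
    """Encode an integer as a fixed-length bit string."""
    return format(value, f"0{length}b")

print(decode_bits("11010110"))  # 214
print(encode_bits(214, 8))      # 11010110
```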
Real-Valued Representation
For continuous optimization: chromosome = vector of real numbers. Gene = single real value. Mutation: add Gaussian noise. Crossover: blend or swap values.
Permutation Representation
Traveling Salesman Problem (TSP): chromosome = ordering of cities. Gene = city position. Crossover: preserve city order. Mutation: swap cities.
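One common variant of order-preserving crossover (order crossover, OX) and swap mutation for permutations can be sketched as follows; function names are illustrative:

```python
import random

def order_crossover(parent1, parent2):
    """Order crossover (OX variant): copy a random slice from parent1,
    fill the remaining positions with parent2's cities in their order."""
    n = len(parent1)
    a, b = sorted(random.sample(range(n), 2))
    child = [None] * n
    child[a:b] = parent1[a:b]                      # inherited slice
    kept = set(parent1[a:b])
    fill = [c for c in parent2 if c not in kept]   # parent2 order, no duplicates
    for i in range(n):
        if child[i] is None:
            child[i] = fill.pop(0)
    return child

def swap_mutation(tour):
    """Swap two randomly chosen cities."""
    i, j = random.sample(range(len(tour)), 2)
    mutated = tour[:]
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated
```

Both operators always return a valid permutation, which is the key requirement for TSP chromosomes.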
Tree-Based Representation
Genetic programming: solution = expression tree (program). Crossover: swap subtrees. Mutation: modify subtree. Enables evolving programs.
Problem-Specific Representation
Design choice critical: representation should facilitate crossover/mutation, map naturally to problem. Poor representation → poor GA performance.
Fitness Function Design
Fitness Definition
Function assigning score to solution. Higher score = better solution. GA drives population toward high fitness.
Examples:
- Maximize f(x) = -(x-5)^2 + 10: max at x=5, fitness peaks there
- Minimize cost: fitness = 1 / (cost + epsilon) (inverse)
- Classification accuracy: fitness = percent correct predictions
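The three examples above as minimal Python sketches (the small `epsilon` guards against division by zero):

```python
def fitness_quadratic(x):
    """Maximize f(x) = -(x-5)^2 + 10; peaks at x = 5."""
    return -(x - 5) ** 2 + 10

def fitness_from_cost(cost, epsilon=1e-9):
    """Turn a minimization objective into a fitness (lower cost -> higher)."""
    return 1.0 / (cost + epsilon)

def fitness_accuracy(predictions, labels):
    """Fraction of correct predictions."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)
```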
Properties of Good Fitness Functions
- Smooth landscape: nearby solutions similar fitness (enables search)
- Clear gradient: solutions improving toward optimum have higher fitness
- Computational efficiency: evaluated millions of times (must be fast)
- Alignment with goal: optimizing fitness = solving problem
Fitness Shaping
Transform fitness to guide search. Example: squaring fitness (f' = f^2) strengthens selection (bigger relative differences); taking the square root (f' = sqrt(f)) weakens it (smaller differences, more exploration).
Multi-Objective Fitness
Multiple objectives: scalarize (weighted sum) or Pareto-based. Pareto: maintain solutions non-dominated (can't improve one without worsening another).
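A Pareto dominance check, assuming all objectives are to be maximized, is a one-liner:

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b: a is at least as
    good in every objective and strictly better in at least one
    (assuming all objectives are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))
```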
Selection Strategies
Fitness-Proportional Selection
Probability of selection proportional to fitness. P(select i) = fitness(i) / sum(fitness). Fitter solutions more likely chosen as parents.
Example:
Fitness: [10, 20, 30]
Total: 60
Probabilities: [10/60, 20/60, 30/60] = [1/6, 1/3, 1/2]
Solution 3 is three times as likely to be selected as solution 1, and 1.5 times as likely as solution 2.
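A roulette-wheel sketch of fitness-proportional selection (assumes non-negative fitness; the Python standard library's `random.choices(population, weights=fitness)` is an equivalent shortcut):

```python
import random

def roulette_select(population, fitness):
    """Pick one parent with probability proportional to fitness."""
    total = sum(fitness)
    r = random.uniform(0, total)
    cumulative = 0.0
    for individual, f in zip(population, fitness):
        cumulative += f
        if r <= cumulative:
            return individual
    return population[-1]  # guard against floating-point rounding
```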
Tournament Selection
Randomly sample k solutions, select fittest. Repeated to fill parent pool. Simpler, more robust than fitness-proportional.
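Tournament selection is a few lines; larger k means stronger selection pressure:

```python
import random

def tournament_select(population, fitness, k=3):
    """Sample k individuals uniformly at random, return the fittest."""
    contestants = random.sample(range(len(population)), k)
    best = max(contestants, key=lambda i: fitness[i])
    return population[best]
```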
Ranking Selection
Rank solutions by fitness, selection probability based on rank (not fitness value). Handles fitness scales well. Elite always selected.
Elitism
Preserve best solutions (copy to next generation without modification). Prevents losing good solutions, monotonic improvement.
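Elitism can be folded into the replacement step; `n_elite` and the function name are illustrative choices:

```python
def replace_with_elitism(population, fitness, offspring, n_elite=2):
    """Carry the n_elite fittest individuals unchanged into the next
    generation; fill the remaining slots with offspring."""
    ranked = sorted(zip(population, fitness), key=lambda pf: pf[1], reverse=True)
    elites = [ind for ind, _ in ranked[:n_elite]]
    return elites + offspring[: len(population) - n_elite]
```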
Crossover and Recombination
One-Point Crossover
Parent 1: 11010110
Parent 2: 10101001
Split point: 4
Child 1: 1101 | 1001 = 11011001
Child 2: 1010 | 0110 = 10100110
Children combine genetic material from both parents.
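The worked example above in code; `point` is exposed as a parameter so a fixed split can be reproduced (normally it is drawn at random):

```python
import random

def one_point_crossover(parent1: str, parent2: str, point=None):
    """Split both bit strings at one point and swap the tails."""
    if point is None:
        point = random.randint(1, len(parent1) - 1)
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2

print(one_point_crossover("11010110", "10101001", point=4))
# ('11011001', '10100110')
```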
Two-Point Crossover
Two split points. Middle segment swapped. Reduces disruption of schemata (long, useful gene patterns).
Uniform Crossover
Each gene independently: 50% from parent 1, 50% from parent 2. Maximum mixing.
Real-Valued Crossover
Blend crossover: child = w * parent1 + (1-w) * parent2 (weighted average). BLX-α: blend with random extension.
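A BLX-α sketch for real-valued chromosomes: each child gene is drawn from the interval spanned by the parents, extended by a fraction α on each side (α = 0 reduces to sampling between the parents):

```python
import random

def blend_crossover(parent1, parent2, alpha=0.5):
    """BLX-alpha: each child gene drawn uniformly from an interval
    around the parent genes, extended by alpha on each side."""
    child = []
    for x, y in zip(parent1, parent2):
        lo, hi = min(x, y), max(x, y)
        span = hi - lo
        child.append(random.uniform(lo - alpha * span, hi + alpha * span))
    return child
```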
Purpose of Crossover
Combine good features from different parents. Exploration of solution space. If parents fitter than average, offspring likely fitter.
Mutation Operators
Bit-Flip Mutation
Randomly flip bits with probability p_m (mutation rate). Provides variation, prevents premature convergence.
Example (p_m = 1/8):
Before: 11010110
Flip bit 3: 11000110
Mutation introduces new genetic variation.
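Bit-flip mutation in code: each bit flips independently with probability `p_m`:

```python
import random

def bit_flip_mutation(chromosome: str, p_m: float) -> str:
    """Flip each bit independently with probability p_m."""
    return "".join(
        ("1" if bit == "0" else "0") if random.random() < p_m else bit
        for bit in chromosome
    )
```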
Gaussian Mutation (Real-Valued)
Add Gaussian noise to genes: x' = x + N(0, σ). Larger σ = bigger mutations.
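The corresponding sketch for real-valued chromosomes:

```python
import random

def gaussian_mutation(genes, sigma=0.1):
    """Add N(0, sigma) noise to every gene; larger sigma = bigger steps."""
    return [x + random.gauss(0.0, sigma) for x in genes]
```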
Mutation Rate
Too low: insufficient variation, premature convergence. Too high: random search, disrupts good solutions. Typical: p_m = 1 / chromosome_length.
Adaptive Mutation
Mutation rate changes over evolution. Start high (exploration), decrease over time (exploitation). Or adjust per gene based on convergence.
Purpose of Mutation
Introduces variation preventing population stagnation. Enables escaping local optima (though GAs can still become trapped). Essential for exploration.
Population Size and Convergence
Population Size Effect
Large population: diverse, slow convergence, expensive. Small population: fast but may converge prematurely. Typical: 50-500 (problem-dependent).
Number of Generations
Run until fitness plateau (no improvement over G consecutive generations) or max generations reached. Trade-off: more generations = better solutions but more computation.
Convergence Criteria
- Maximum generations elapsed
- Fitness unchanged for G generations (stagnation)
- All solutions converged (same genes)
- Time/resource limit
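The stagnation criterion above can be sketched as a check over the per-generation best-fitness history; `patience` plays the role of G:

```python
def has_stagnated(fitness_history, patience):
    """True if the best fitness seen has not improved over the
    last `patience` generations."""
    if len(fitness_history) <= patience:
        return False
    recent_best = max(fitness_history[-patience:])
    earlier_best = max(fitness_history[:-patience])
    return recent_best <= earlier_best
```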
Convergence Behavior
Typical: rapid initial improvement, plateaus. S-shaped curve: slow start, rapid middle, asymptotic tail.
Premature Convergence and Diversity
Premature Convergence
Population converges to local optimum before finding global best. Causes: low mutation, small population, strong selection pressure.
Maintaining Diversity
- Higher mutation: Introduces variation constantly
- Larger population: More genetic material preserved
- Fitness sharing: Reduce fitness of similar solutions (rewards diversity)
- Niche techniques: Separate population into subgroups
No Free Lunch
GA excellent for some problems, poor for others. No algorithm optimal across all problems. Success depends on problem structure, representation, parameter tuning.
Robustness
GAs relatively robust: work across diverse problem classes, less sensitive to initial conditions than gradient descent. But parameter tuning important for performance.
Applications in Engineering and Design
Parameter Optimization
Tune parameters of complex system (neural networks, control systems). GA searches parameter space, improves performance.
Engineering Design
Design aircraft wings, car bodies, mechanical structures. Objectives: minimize weight, maximize strength. GA explores design space, balances trade-offs.
Scheduling
Job shop scheduling, school timetables: assign tasks to time slots. Fitness = meeting constraints, minimizing conflicts. NP-hard problem, GA finds good (not optimal) solutions.
Traveling Salesman Problem (TSP)
Find shortest tour visiting all cities. GA with permutation representation, crossover preserving order. Efficient for medium-size instances.
Machine Learning
Hyperparameter tuning: learning rate, layer sizes, regularization. GA alternative to grid/random search. Also used for feature selection.
Game AI
Evolve strategies: GA used to evolve neural network weights, game strategies, team tactics. Enables emergence of complex behavior.
Scientific Discovery
Symbolic regression: evolve mathematical expressions fitting data. Automated hypothesis generation. Genetic programming variant.
References
- Holland, J. H. "Adaptation in Natural and Artificial Systems." University of Michigan Press, 1975.
- Goldberg, D. E. "Genetic Algorithms in Search, Optimization, and Machine Learning." Addison-Wesley, 1989.
- Mitchell, M. "An Introduction to Genetic Algorithms." MIT Press, 1996.
- Koza, J. R. "Genetic Programming: On the Programming of Computers by Means of Natural Selection." MIT Press, 1992.
- Michalewicz, Z. "Genetic Algorithms + Data Structures = Evolution Programs." Springer, 3rd edition, 1996.