Introduction

Genetic algorithms (GAs) are evolutionary algorithms that mimic natural selection. A population of candidate solutions (chromosomes) evolves over generations: the fittest solutions survive, reproduce, and mutate. After many generations, the population converges to high-quality solutions.

Applicable to optimization problems: parameter tuning, scheduling, design optimization, machine learning hyperparameter search. Strengths: global search (less prone to local optima), parallelizable, handles non-linear/multi-modal landscapes. Weaknesses: slow convergence, no optimality guarantees, parameter tuning required.

Inspired by Darwin's natural selection. Stronger organisms reproduce, pass genes to offspring. Over time, population adapts to environment. GAs apply same principles to abstract problem spaces: better solutions persist, combine (crossover), occasionally mutate (exploration).

"Genetic algorithms harness evolution's power to solve complex optimization problems. Where traditional methods fail in rugged landscapes, GAs excel through parallel population-based search." -- Evolutionary computation research

Evolution and Natural Selection Principles

Natural Selection

Environment exerts pressure: organisms well-adapted to environment survive and reproduce. Over generations, population composition shifts toward more adapted individuals. Driving force: differential reproduction (fitter organisms produce more offspring).

Genetic Variation

Mutations create variation. Sexual reproduction (genetic recombination) combines genes. Variation provides material for selection: without variation, nothing to select.

Inheritance

Offspring inherit genes from parents. Beneficial mutations persist if they improve fitness. Harmful mutations eliminated by selection pressure.

GA Analogy

Nature                  Genetic Algorithm
------                  -----------------
Organism                Candidate solution (chromosome)
Fitness (survival)      Fitness function score
Genes                   Problem parameters
Reproduction            Selection and crossover
Mutation                Random parameter perturbation

Genetic Algorithm Overview

Algorithm Steps

1. Initialize: create random population of solutions
2. Evaluate: compute fitness for each solution
3. Select: choose fittest solutions (with probability proportional to fitness)
4. Crossover: combine selected solutions to create offspring
5. Mutate: randomly modify offspring genes
6. Replace: form new population
7. Repeat: steps 2-6 until convergence (fitness plateaus or max generations)

Pseudocode

import random

def genetic_algorithm(population_size, num_generations):
    population = [random_solution() for _ in range(population_size)]

    for generation in range(num_generations):
        # Evaluate
        fitness = [evaluate(solution) for solution in population]

        # Select
        parents = select_proportional_to_fitness(population, fitness)

        # Crossover and Mutate
        offspring = []
        for _ in range(population_size):
            parent1, parent2 = random.sample(parents, 2)
            child = crossover(parent1, parent2)
            child = mutate(child)
            offspring.append(child)

        # Replace
        population = offspring

    # Re-evaluate the final population before picking the winner
    fitness = [evaluate(solution) for solution in population]
    return best_solution(population, fitness)

Iterative Improvement

GA iteratively improves population: each generation, fitter solutions more likely to pass genes. Population gradually converges toward optimum (or local optimum).

Representation: Chromosomes and Genes

Binary Representation

Classic: solution encoded as bit string. Gene = single bit (0 or 1). Chromosome = sequence of bits. Example: 8-bit integer represented as "11010110" (214).
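As a sketch, this encoding can be written with Python's built-in integer/bit-string conversions (function names here are illustrative):

```python
def encode(value, num_bits):
    # Encode an unsigned integer as a fixed-width bit string.
    return format(value, f"0{num_bits}b")

def decode(bits):
    # Decode a bit string back into an integer.
    return int(bits, 2)

chromosome = encode(214, 8)  # "11010110"
```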

Real-Valued Representation

For continuous optimization: chromosome = vector of real numbers. Gene = single real value. Mutation: add Gaussian noise. Crossover: blend or swap values.

Permutation Representation

Traveling Salesman Problem (TSP): chromosome = ordering of cities. Gene = city position. Crossover: preserve city order. Mutation: swap cities.
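A possible sketch of these permutation operators, using order crossover and swap mutation (one common choice among several):

```python
import random

def order_crossover(parent1, parent2):
    # Order crossover (OX): copy a random slice from parent1, then fill
    # the remaining positions with parent2's cities in their original order.
    n = len(parent1)
    a, b = sorted(random.sample(range(n), 2))
    child = [None] * n
    child[a:b] = parent1[a:b]
    fill = [city for city in parent2 if city not in child]
    for i in range(n):
        if child[i] is None:
            child[i] = fill.pop(0)
    return child

def swap_mutation(tour):
    # Swap two randomly chosen cities.
    i, j = random.sample(range(len(tour)), 2)
    mutated = tour[:]
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated
```

Both operators always return a valid tour: every city appears exactly once.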

Tree-Based Representation

Genetic programming: solution = expression tree (program). Crossover: swap subtrees. Mutation: modify subtree. Enables evolving programs.

Problem-Specific Representation

Design choice critical: representation should facilitate crossover/mutation, map naturally to problem. Poor representation → poor GA performance.

Fitness Function Design

Fitness Definition

Function assigning score to solution. Higher score = better solution. GA drives population toward high fitness.

Examples:
- Maximize f(x) = -(x-5)^2 + 10: max at x=5, fitness peaks there
- Minimize cost: fitness = 1 / (cost + epsilon) (inverse)
- Classification accuracy: fitness = percent correct predictions
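The three examples above, sketched directly in Python:

```python
def fitness_quadratic(x):
    # -(x - 5)^2 + 10 peaks at x = 5 with fitness 10.
    return -(x - 5) ** 2 + 10

def fitness_from_cost(cost, epsilon=1e-9):
    # Invert a cost so that lower cost maps to higher fitness.
    return 1.0 / (cost + epsilon)

def fitness_accuracy(predictions, labels):
    # Fraction of correct predictions.
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)
```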

Properties of Good Fitness Functions

  • Smooth landscape: nearby solutions similar fitness (enables search)
  • Clear gradient: solutions improving toward optimum have higher fitness
  • Computational efficiency: evaluated millions of times (must be fast)
  • Alignment with goal: optimizing fitness = solving problem

Fitness Shaping

Transform fitness to tune selection pressure. Example: squaring fitness (f → f^2) exaggerates differences (strong selection); taking the square root (f → sqrt(f)) compresses them (weaker selection, more exploration).
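Fitness shaping as a power transform can be sketched as:

```python
def shaped_fitness(raw_fitness, exponent):
    # Raise each fitness to a power: exponent > 1 amplifies differences
    # (stronger selection), exponent < 1 compresses them (more exploration).
    return [f ** exponent for f in raw_fitness]
```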

Multi-Objective Fitness

Multiple objectives: scalarize (weighted sum) or Pareto-based. Pareto: maintain solutions non-dominated (can't improve one without worsening another).
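A minimal sketch of Pareto dominance for a maximization problem, with objective vectors as tuples:

```python
def dominates(a, b):
    # a dominates b if a is at least as good on every objective
    # and strictly better on at least one (maximization).
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(points):
    # Keep the points not dominated by any other point.
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]
```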

Selection Strategies

Fitness-Proportional Selection

Probability of selection proportional to fitness. P(select i) = fitness(i) / sum(fitness). Fitter solutions more likely chosen as parents.

Example:
Fitness: [10, 20, 30]
Total: 60
Probabilities: [10/60, 20/60, 30/60] = [1/6, 1/3, 1/2]

Solution 3 is 1.5 times as likely to be selected as solution 2, and three times as likely as solution 1.
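A roulette-wheel sketch of fitness-proportional selection (Python's random.choices with weights=fitness would also work):

```python
import random

def roulette_select(population, fitness):
    # Spin the wheel: pick an individual with probability
    # proportional to its fitness.
    total = sum(fitness)
    spin = random.uniform(0, total)
    cumulative = 0.0
    for individual, f in zip(population, fitness):
        cumulative += f
        if spin <= cumulative:
            return individual
    return population[-1]  # guard against floating-point rounding
```

With fitness [10, 20, 30], the third individual is selected about half the time.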

Tournament Selection

Randomly sample k solutions, select fittest. Repeated to fill parent pool. Simpler, more robust than fitness-proportional.
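A tournament-selection sketch:

```python
import random

def tournament_select(population, fitness, k=3):
    # Sample k individuals at random and return the fittest of them.
    contenders = random.sample(range(len(population)), k)
    winner = max(contenders, key=lambda i: fitness[i])
    return population[winner]
```

Larger k increases selection pressure: with k equal to the population size, the best individual always wins.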

Ranking Selection

Rank solutions by fitness; selection probability depends on rank, not raw fitness value. Robust to fitness scaling and outliers, since only the ordering matters.

Elitism

Preserve best solutions (copy to next generation without modification). Prevents losing good solutions, monotonic improvement.
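An elitism sketch (num_elites is an illustrative parameter):

```python
def apply_elitism(population, fitness, offspring, num_elites=2):
    # Copy the top num_elites individuals unchanged into the next
    # generation, dropping the same number of offspring to keep size fixed.
    ranked = sorted(zip(fitness, population), key=lambda pair: pair[0], reverse=True)
    elites = [individual for _, individual in ranked[:num_elites]]
    return elites + offspring[: len(population) - num_elites]
```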

Crossover and Recombination

One-Point Crossover

Parent 1: 11010110
Parent 2: 10101001
Split point: 4

Child 1: 1101 | 1001 = 11011001
Child 2: 1010 | 0110 = 10100110

Children combine genetic material from both parents.
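The example above as code (the split point counts genes from the left):

```python
def one_point_crossover(parent1, parent2, point):
    # Children keep one parent's head and the other parent's tail.
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2

child1, child2 = one_point_crossover("11010110", "10101001", point=4)
# child1 == "11011001", child2 == "10100110"
```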

Two-Point Crossover

Two split points. Middle segment swapped. Reduces schema disruption (preserves long, useful gene patterns).

Uniform Crossover

Each gene independently: 50% from parent 1, 50% from parent 2. Maximum mixing.

Real-Valued Crossover

Blend crossover: child = w * parent1 + (1-w) * parent2 (weighted average). BLX-α: sample each child gene from the parents' interval, randomly extended by a fraction α on each side.
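A BLX-α sketch (α = 0.5 is a common default; each child gene is drawn uniformly from the parents' interval extended by α·|x1 − x2| on each side):

```python
import random

def blend_crossover(parent1, parent2, alpha=0.5):
    # BLX-alpha: sample each child gene uniformly from the parents'
    # interval, extended by alpha * |x1 - x2| beyond both endpoints.
    child = []
    for x1, x2 in zip(parent1, parent2):
        lo, hi = min(x1, x2), max(x1, x2)
        spread = hi - lo
        child.append(random.uniform(lo - alpha * spread, hi + alpha * spread))
    return child
```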

Purpose of Crossover

Combine good features from different parents. Exploration of solution space. If parents fitter than average, offspring likely fitter.

Mutation Operators

Bit-Flip Mutation

Randomly flip bits with probability p_m (mutation rate). Provides variation, prevents premature convergence.

Example (p_m = 1/8):
Before: 11010110
Flip bit 3: 11000110
Mutation introduces new genetic variation.
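A bit-flip mutation sketch on a bit string:

```python
import random

def bit_flip_mutation(bits, p_m):
    # Flip each bit independently with probability p_m.
    mutated = []
    for b in bits:
        if random.random() < p_m:
            mutated.append("1" if b == "0" else "0")
        else:
            mutated.append(b)
    return "".join(mutated)
```

With p_m = 0 the chromosome is unchanged; with p_m = 1 every bit flips.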

Gaussian Mutation (Real-Valued)

Add Gaussian noise to genes: x' = x + N(0, σ). Larger σ = bigger mutations.
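As a one-line sketch:

```python
import random

def gaussian_mutation(genes, sigma):
    # Perturb every gene with zero-mean Gaussian noise of std dev sigma.
    return [x + random.gauss(0.0, sigma) for x in genes]
```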

Mutation Rate

Too low: insufficient variation, premature convergence. Too high: random search, disrupts good solutions. Typical: p_m = 1 / chromosome_length.

Adaptive Mutation

Mutation rate changes over evolution. Start high (exploration), decrease over time (exploitation). Or adjust per gene based on convergence.
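One simple schedule is linear annealing from a high to a low rate (a sketch; exponential decay is an equally common choice):

```python
def annealed_mutation_rate(initial_rate, final_rate, generation, num_generations):
    # Linearly interpolate from initial_rate (exploration) down to
    # final_rate (exploitation) over the course of the run.
    t = min(generation / num_generations, 1.0)
    return initial_rate + t * (final_rate - initial_rate)
```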

Purpose of Mutation

Introduces variation preventing population stagnation. Enables escaping local optima (though populations can still become trapped). Essential for exploration.

Population Size and Convergence

Population Size Effect

Large population: diverse, slow convergence, expensive. Small population: fast but may converge prematurely. Typical: 50-500 (problem-dependent).

Number of Generations

Run until fitness plateau (no improvement over G consecutive generations) or max generations reached. Trade-off: more generations = better solutions but more computation.

Convergence Criteria

  • Maximum generations elapsed
  • Fitness unchanged for G generations (stagnation)
  • All solutions converged (same genes)
  • Time/resource limit
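A stagnation check over the best-fitness history (a sketch of the second criterion):

```python
def has_stagnated(best_fitness_history, G):
    # True if the best fitness has not improved in the last G generations.
    if len(best_fitness_history) <= G:
        return False
    baseline = best_fitness_history[-(G + 1)]
    return max(best_fitness_history[-G:]) <= baseline
```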

Convergence Behavior

Typical: rapid initial improvement, plateaus. S-shaped curve: slow start, rapid middle, asymptotic tail.

Premature Convergence and Diversity

Premature Convergence

Population converges to local optimum before finding global best. Causes: low mutation, small population, strong selection pressure.

Maintaining Diversity

  • Higher mutation: Introduces variation constantly
  • Larger population: More genetic material preserved
  • Fitness sharing: Reduce fitness of similar solutions (rewards diversity)
  • Niche techniques: Separate population into subgroups
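A fitness-sharing sketch with a triangular sharing kernel (the distance function and sigma_share are problem-specific assumptions):

```python
def shared_fitness(population, fitness, distance, sigma_share):
    # Divide each fitness by its niche count: neighbours within
    # sigma_share contribute, with closer neighbours weighted more.
    shared = []
    for individual, f in zip(population, fitness):
        niche_count = sum(
            max(0.0, 1.0 - distance(individual, other) / sigma_share)
            for other in population
        )
        shared.append(f / niche_count)  # niche_count >= 1 (self-distance is 0)
    return shared
```

Crowded individuals get their fitness discounted, so isolated solutions survive selection more easily.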

No Free Lunch

GA excellent for some problems, poor for others. No algorithm optimal across all problems. Success depends on problem structure, representation, parameter tuning.

Robustness

GAs relatively robust: work across diverse problem classes, less sensitive to initial conditions than gradient descent. But parameter tuning important for performance.

Applications in Engineering and Design

Parameter Optimization

Tune parameters of complex system (neural networks, control systems). GA searches parameter space, improves performance.

Engineering Design

Design aircraft wings, car bodies, mechanical structures. Objectives: minimize weight, maximize strength. GA explores design space, balances trade-offs.

Scheduling

Job shop scheduling, school timetables: assign tasks to time slots. Fitness = meeting constraints, minimizing conflicts. NP-hard problem, GA finds good (not optimal) solutions.

Traveling Salesman Problem (TSP)

Find shortest tour visiting all cities. GA with permutation representation, crossover preserving order. Efficient for medium-size instances.

Machine Learning

Hyperparameter tuning: learning rate, layer sizes, regularization. GA alternative to grid/random search. Also used for feature selection.

Game AI

Evolve strategies: GA used to evolve neural network weights, game strategies, team tactics. Enables emergence of complex behavior.

Scientific Discovery

Symbolic regression: evolve mathematical expressions fitting data. Automated hypothesis generation. Genetic programming variant.

References

  • Holland, J. H. "Adaptation in Natural and Artificial Systems." University of Michigan Press, 1975.
  • Goldberg, D. E. "Genetic Algorithms in Search, Optimization, and Machine Learning." Addison-Wesley, 1989.
  • Mitchell, M. "An Introduction to Genetic Algorithms." MIT Press, 1996.
  • Koza, J. R. "Genetic Programming: On the Programming of Computers by Means of Natural Selection." MIT Press, 1992.
  • Michalewicz, Z. "Genetic Algorithms + Data Structures = Evolution Programs." Springer, 3rd edition, 1996.