Introduction

ANOVA (Analysis of Variance): statistical technique comparing means of three or more groups. Purpose: test if group means differ significantly. Developed by Ronald Fisher (1920s). Core concept: partitioning total variability into components attributable to different sources.

A single ANOVA compares all group means in one test, which keeps the Type I error rate at the chosen α. Running a separate t-test for every pair of groups inflates it.

ANOVA Concept

Variance Partitioning

Total variability decomposed into between-group variance and within-group variance. Between-group variance: variability due to treatment or group differences. Within-group variance: variability due to random error or individual differences.

Rationale

If between-group variance much larger than within-group variance, suggests real differences in group means. Otherwise, differences likely due to chance.

F-Ratio

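Calculation: ratio of between-group mean square to within-group mean square. Under H0 the ratio is expected to be near 1. An F-ratio well above 1 is evidence of a group effect, with significance judged against the F-distribution.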

Types of ANOVA

One-Way ANOVA

One independent categorical variable (factor) with multiple levels. Tests effect of single factor on continuous outcome.

Two-Way ANOVA

Two independent factors. Tests main effects and interaction effects between factors.
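
Main effects and the interaction can be illustrated with a small sums-of-squares computation. The following is a minimal sketch, assuming a balanced design (the same number of observations in every cell). The factor levels (a1, a2, b1, b2) and the data are hypothetical, and the formulas are the standard ones for a balanced two-factor layout rather than anything specific to this text.

```python
from statistics import mean

# Hypothetical balanced 2 x 2 design: two observations per cell.
cells = {
    ("a1", "b1"): [1.0, 3.0],
    ("a1", "b2"): [3.0, 5.0],
    ("a2", "b1"): [2.0, 4.0],
    ("a2", "b2"): [8.0, 10.0],
}
a_levels = sorted({a for a, _ in cells})
b_levels = sorted({b for _, b in cells})
n = 2  # observations per cell (balanced design assumed)

grand = mean([x for values in cells.values() for x in values])
cell_mean = {key: mean(values) for key, values in cells.items()}
# In a balanced design the marginal mean is the mean of the cell means.
a_mean = {a: mean([cell_mean[(a, b)] for b in b_levels]) for a in a_levels}
b_mean = {b: mean([cell_mean[(a, b)] for a in a_levels]) for b in b_levels}

# Main effects: spread of the marginal means around the grand mean.
ss_a = len(b_levels) * n * sum((a_mean[a] - grand) ** 2 for a in a_levels)
ss_b = len(a_levels) * n * sum((b_mean[b] - grand) ** 2 for b in b_levels)
# Interaction: what is left in the cell means after removing both main effects.
ss_ab = n * sum(
    (cell_mean[(a, b)] - a_mean[a] - b_mean[b] + grand) ** 2
    for a in a_levels
    for b in b_levels
)
# Error: spread of observations around their own cell mean.
ss_within = sum(
    (x - cell_mean[key]) ** 2 for key, values in cells.items() for x in values
)

df_a = len(a_levels) - 1
df_b = len(b_levels) - 1
df_ab = df_a * df_b
df_within = len(a_levels) * len(b_levels) * (n - 1)
ms_within = ss_within / df_within

f_a = (ss_a / df_a) / ms_within
f_b = (ss_b / df_b) / ms_within
f_ab = (ss_ab / df_ab) / ms_within
print(f"F_A = {f_a:.2f}, F_B = {f_b:.2f}, F_AxB = {f_ab:.2f}")
```

Each F-ratio is compared against its own F-distribution, with the effect's degrees of freedom in the numerator and df_within in the denominator. Unbalanced designs need a different treatment and are not covered by this sketch.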

Repeated Measures ANOVA

Same subjects measured under different conditions or times. Accounts for within-subject variability.

Multivariate ANOVA (MANOVA)

Multiple dependent variables tested simultaneously for differences across groups.

Assumptions

Independence

Observations must be independent within and across groups.

Normality

Data in each group roughly normally distributed.

Homogeneity of Variances

Variances across groups should be approximately equal (homoscedasticity).
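
A quick informal screen for this assumption is the ratio of the largest to the smallest sample variance. The sketch below uses hypothetical data, and the cutoff of 4 is a common rule of thumb adopted here as an assumption, not a formal test. Formal checks such as Levene's test are available in statistics packages.

```python
from statistics import variance

# Hypothetical data: three groups of five observations each.
groups = {
    "A": [4.0, 5.0, 6.0, 5.0, 5.0],
    "B": [6.0, 8.0, 7.0, 9.0, 5.0],
    "C": [9.0, 10.0, 9.0, 8.0, 9.0],
}

# Sample variance (n - 1 denominator) of each group.
variances = {name: variance(values) for name, values in groups.items()}
ratio = max(variances.values()) / min(variances.values())

MAX_RATIO = 4.0  # assumed rule-of-thumb cutoff, not a formal criterion
roughly_equal = ratio <= MAX_RATIO

print(variances)
print(f"largest/smallest variance ratio = {ratio:.2f}, roughly equal: {roughly_equal}")
```

Here group B is noticeably more spread out than A and C, so the screen flags the data for a closer look.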

Measurement Level

Dependent variable should be continuous, measured on an interval or ratio scale.

ANOVA Hypotheses

Null Hypothesis (H0)

All group means are equal: μ1 = μ2 = ... = μk.

Alternative Hypothesis (Ha)

At least one group mean differs from the others.

Interpretation

Rejecting H0 indicates a statistically significant effect of the factor on the outcome variable. It does not identify which group means differ.

ANOVA Formula and Computation

Total Sum of Squares (SST)

Measures total variability in data.

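SST = Σ_j Σ_i (X_ij - X̄..)²

where X_ij is observation i in group j and X̄.. is the grand mean.

Between-Group Sum of Squares (SSB)

Variability of group means around the grand mean, weighted by group size n_j.

SSB = Σ_j n_j (X̄.j - X̄..)²

Within-Group Sum of Squares (SSW)

Variability of observations around their own group mean X̄.j.

SSW = Σ_j Σ_i (X_ij - X̄.j)²

The three quantities satisfy SST = SSB + SSW.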
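
The three sums of squares can be computed directly from their definitions. The following is a minimal sketch with hypothetical data, in which each dictionary entry is one group. It also checks that the between-group and within-group parts add up to the total.

```python
from statistics import mean

# Hypothetical data: three groups of five observations each.
groups = {
    "A": [4.0, 5.0, 6.0, 5.0, 5.0],
    "B": [7.0, 8.0, 6.0, 7.0, 7.0],
    "C": [9.0, 10.0, 9.0, 8.0, 9.0],
}

all_values = [x for values in groups.values() for x in values]
grand_mean = mean(all_values)

# SST: every observation against the grand mean.
sst = sum((x - grand_mean) ** 2 for x in all_values)
# SSB: each group mean against the grand mean, weighted by group size.
ssb = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
# SSW: every observation against its own group mean.
ssw = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)

print(f"SST = {sst:.1f}, SSB = {ssb:.1f}, SSW = {ssw:.1f}")
# For this data: SST = 46.0, SSB = 40.0, SSW = 6.0
```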

Mean Squares

Between-group mean square: MSB = SSB / (k - 1). Within-group mean square: MSW = SSW / (N - k).

F Statistic

F = MSB / MSW.
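
As a worked example of the two steps above, the sketch below turns sums of squares into mean squares and an F statistic. The function name and the input values (SSB = 40, SSW = 6, k = 3 groups, N = 15 observations) are hypothetical.

```python
def f_statistic(ssb, ssw, k, n_total):
    """Return (MSB, MSW, F) for a one-way ANOVA.

    ssb, ssw : between-group and within-group sums of squares
    k        : number of groups
    n_total  : total number of observations (N)
    """
    msb = ssb / (k - 1)        # between-group mean square
    msw = ssw / (n_total - k)  # within-group mean square
    return msb, msw, msb / msw


# Hypothetical inputs: SSB = 40, SSW = 6, three groups, fifteen observations.
msb, msw, f_value = f_statistic(40.0, 6.0, k=3, n_total=15)
print(f"MSB = {msb}, MSW = {msw}, F = {f_value}")
# MSB = 20.0, MSW = 0.5, F = 40.0
```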

ANOVA Table

Organizes sums of squares, degrees of freedom, mean squares, F-value, and p-value.

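| Source of Variation | Sum of Squares (SS) | Degrees of Freedom (df) | Mean Square (MS) | F | p-value |
| --- | --- | --- | --- | --- | --- |
| Between Groups | SSB | k - 1 | MSB = SSB / (k - 1) | F = MSB / MSW | Calculated |
| Within Groups | SSW | N - k | MSW = SSW / (N - k) | | |
| Total | SST | N - 1 | | | |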

F-Distribution

Definition

Continuous probability distribution of F-statistic under null hypothesis. Defined by two degrees of freedom: numerator (df1 = k - 1) and denominator (df2 = N - k).

Shape

Right-skewed, non-negative values. Critical values depend on df1, df2, and significance level α.

Use in ANOVA

Compare computed F to critical F: if F > F_critical, reject null hypothesis.
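
The decision rule can be sketched without statistical tables in one special case. When there are exactly three groups (df1 = 2), the upper tail of the F-distribution has the closed form P(F ≥ x) = (1 + 2x/df2)^(-df2/2). The code below relies on that identity, so it applies only when df1 = 2. The function names and the observed F value are hypothetical. For other degrees of freedom a statistics library (for example SciPy's `scipy.stats.f`) would be the usual route.

```python
def f_upper_tail_df1_2(x, df2):
    """P(F >= x) for an F(2, df2) distribution (closed form, df1 = 2 only)."""
    return (1 + 2 * x / df2) ** (-df2 / 2)


def f_critical_df1_2(alpha, df2):
    """Critical value with upper-tail area alpha for F(2, df2) (df1 = 2 only)."""
    return (df2 / 2) * (alpha ** (-2 / df2) - 1)


k, n_total = 3, 15                # three groups, fifteen observations
df1, df2 = k - 1, n_total - k     # 2 and 12
alpha = 0.05
f_value = 40.0                    # hypothetical observed F statistic

f_crit = f_critical_df1_2(alpha, df2)      # about 3.89
p_value = f_upper_tail_df1_2(f_value, df2)
reject_h0 = f_value > f_crit               # same decision as p_value < alpha

print(f"F_critical = {f_crit:.3f}, p = {p_value:.2e}, reject H0: {reject_h0}")
```

Comparing F with the critical value and comparing the p-value with α always give the same decision.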

Post Hoc Tests

Purpose

Identify which specific means differ after rejecting null hypothesis in ANOVA.

Common Methods

Tukey's HSD, Bonferroni correction, Scheffé test, Dunnett's test. Each controls the familywise Type I error rate across the set of comparisons.
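
Of these, the Bonferroni correction is the simplest to show by hand: with m comparisons, each is tested at α/m, or equivalently each p-value is multiplied by m and capped at 1. The sketch below assumes four groups. The group names and the unadjusted p-values are hypothetical placeholders, not results from any real analysis.

```python
from itertools import combinations

group_names = ["A", "B", "C", "D"]  # hypothetical groups
alpha = 0.05

pairs = list(combinations(group_names, 2))  # all pairwise comparisons
m = len(pairs)                              # k(k - 1)/2 = 6 for k = 4
alpha_per_test = alpha / m

# Hypothetical unadjusted p-values, one per pair, in the same order as `pairs`.
raw_p = [0.001, 0.020, 0.300, 0.004, 0.049, 0.600]

# Bonferroni-adjusted p-values: multiply by m and cap at 1.
adjusted_p = [min(1.0, p * m) for p in raw_p]
significant = [pair for pair, p in zip(pairs, adjusted_p) if p < alpha]

print(f"{m} comparisons, per-test alpha = {alpha_per_test:.4f}")
print("significant pairs:", significant)
```

Two raw p-values below 0.05 (0.020 and 0.049) no longer count as significant after adjustment, which is the price paid for controlling the error rate across all six comparisons.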

Selection Criteria

Depends on sample sizes, number of comparisons, assumptions.

Interpretation

Post hoc results indicate pairwise group differences with adjusted significance levels.

Advantages and Disadvantages

Advantages

  • Tests multiple groups in one test, keeping the Type I error rate at α instead of inflating it as repeated t-tests would.
  • Flexible: handles multiple factors, interactions, repeated measures.
  • Widely applicable across disciplines.
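
The Type I error point can be made concrete with a little arithmetic. If all pairs among k groups were tested separately at level α, and the tests were independent, the chance of at least one false positive would be 1 - (1 - α)^m with m = k(k - 1)/2. Pairwise tests on shared groups are not truly independent, so the sketch below is an approximation for illustration only; the function name is hypothetical.

```python
from math import comb


def familywise_error_rate(k, alpha=0.05):
    """Approximate chance of at least one false positive over all pairwise tests.

    Assumes the m = k(k - 1)/2 tests are independent, which is a simplification.
    """
    m = comb(k, 2)  # number of pairwise comparisons among k groups
    return 1 - (1 - alpha) ** m


for k in (3, 4, 5):
    print(k, comb(k, 2), round(familywise_error_rate(k), 3))
# 3 3 0.143
# 4 6 0.265
# 5 10 0.401
```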

Disadvantages

  • Assumptions must be met for valid results.
  • Omnibus test only: shows that some means differ, not which ones or in what pattern.
  • Follow-up post hoc tests add complexity, and their multiplicity corrections reduce power, raising the risk of Type II errors.

Applications

Experimental Research

Compare treatment effects across groups in medicine, psychology, agriculture.

Quality Control

Assess process variations and factor effects in manufacturing.

Social Sciences

Analyze survey data with categorical predictors on continuous outcomes.

Marketing

Test consumer preferences among multiple product versions.

| Field | Example Use Case |
| --- | --- |
| Medicine | Comparing drug efficacy among patient groups |
| Agriculture | Testing crop yields under different fertilizers |
| Psychology | Analyzing cognitive test scores by learning method |
| Marketing | Evaluating customer satisfaction across product variants |
