Introduction
ANOVA (Analysis of Variance): statistical technique comparing means of three or more groups. Purpose: test if group means differ significantly. Developed by Ronald Fisher (1920s). Core concept: partitioning total variability into components attributable to different sources.
A single ANOVA compares several means in one test at the chosen significance level α, whereas running many pairwise t-tests inflates the overall Type I error rate.
ANOVA Concept
Variance Partitioning
Total variability decomposed into between-group variance and within-group variance. Between-group variance: variability due to treatment or group differences. Within-group variance: variability due to random error or individual differences.
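The decomposition can be written as an identity. A sketch in LaTeX, using the notation of the sums of squares defined later in this article and assuming k groups, n_j observations in group j, and N observations in total:

```latex
\underbrace{\sum_{j=1}^{k}\sum_{i=1}^{n_j}\left(X_{ij}-\bar{X}_{..}\right)^2}_{\text{SST}}
=
\underbrace{\sum_{j=1}^{k} n_j\left(\bar{X}_{j.}-\bar{X}_{..}\right)^2}_{\text{SSB}}
+
\underbrace{\sum_{j=1}^{k}\sum_{i=1}^{n_j}\left(X_{ij}-\bar{X}_{j.}\right)^2}_{\text{SSW}},
\qquad
N - 1 = (k - 1) + (N - k)
```

The degrees of freedom partition in the same way as the sums of squares.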
Rationale
If between-group variance much larger than within-group variance, suggests real differences in group means. Otherwise, differences likely due to chance.
F-Ratio
Calculation: ratio of between-group mean square to within-group mean square. Under the null hypothesis the ratio is expected to be close to 1; an F-ratio well above 1 is evidence that group means differ.
Types of ANOVA
One-Way ANOVA
One independent categorical variable (factor) with multiple levels. Tests effect of single factor on continuous outcome.
Two-Way ANOVA
Two independent factors. Tests main effects and interaction effects between factors.
Repeated Measures ANOVA
Same subjects measured under different conditions or times. Accounts for within-subject variability.
Multivariate Analysis of Variance (MANOVA)
Multiple dependent variables tested simultaneously for differences across groups.
Assumptions
Independence
Observations must be independent within and across groups.
Normality
Data within each group (equivalently, the residuals) approximately normally distributed.
Homogeneity of Variances
Variances across groups should be approximately equal (homoscedasticity).
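A quick informal check of this assumption is to compare the largest and smallest sample variances. The sketch below is only a heuristic, not a formal test such as Levene's; the function name `variance_ratio` and the data are hypothetical.

```python
from statistics import variance

def variance_ratio(groups):
    """Ratio of the largest to the smallest sample variance across groups.

    Assumes every group has at least two observations and a nonzero variance.
    """
    variances = [variance(g) for g in groups]
    return max(variances) / min(variances)

# Hypothetical data: three groups of three observations each.
groups = [[4.0, 5.0, 6.0], [5.0, 7.0, 9.0], [8.0, 9.0, 10.0]]
ratio = variance_ratio(groups)
print(f"largest / smallest variance = {ratio:.2f}")
```

A ratio near 1 is consistent with equal variances; a large ratio suggests following up with a formal test.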
Measurement Level
Dependent variable should be continuous, measured on an interval or ratio scale.
ANOVA Hypotheses
Null Hypothesis (H0)
All group means are equal: μ1 = μ2 = ... = μk.
Alternative Hypothesis (Ha)
At least one group mean differs from the others.
Interpretation
Rejecting H0 indicates a statistically significant effect of the factor on the outcome variable. It does not show which group means differ.
ANOVA Formula and Computation
Total Sum of Squares (SST)
Measures total variability in data.
SST = ΣΣ (X_ij - X̄..)²
Between-Group Sum of Squares (SSB)
Variability due to group means.
SSB = Σ n_j (X̄_j. - X̄..)²
Within-Group Sum of Squares (SSW)
Variability within groups.
SSW = ΣΣ (X_ij - X̄_j.)²
Mean Squares
Between-group mean square: MSB = SSB / (k - 1). Within-group mean square: MSW = SSW / (N - k).
F Statistic
F = MSB / MSW.
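The formulas in this section can be traced on a small data set. The following is a minimal sketch in plain Python; the function name `one_way_anova` and the data are hypothetical and chosen so the arithmetic is easy to follow by hand.

```python
from statistics import mean

def one_way_anova(groups):
    """Compute sums of squares, mean squares and F for a list of groups."""
    k = len(groups)                                   # number of groups
    n_total = sum(len(g) for g in groups)             # N, total observations
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [mean(g) for g in groups]

    # SST: squared deviations of every observation from the grand mean
    sst = sum((x - grand_mean) ** 2 for g in groups for x in g)
    # SSB: squared deviations of group means from the grand mean, weighted by n_j
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # SSW: squared deviations of every observation from its own group mean
    ssw = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    msb = ssb / (k - 1)
    msw = ssw / (n_total - k)
    return {"SST": sst, "SSB": ssb, "SSW": ssw,
            "MSB": msb, "MSW": msw, "F": msb / msw}

# Hypothetical data: group means 5, 7, 9 and grand mean 7.
result = one_way_anova([[4, 5, 6], [6, 7, 8], [8, 9, 10]])
for name, value in result.items():
    print(f"{name} = {value:.2f}")
```

For this data SSB = 24 and SSW = 6, which sum to SST = 30, so MSB = 12, MSW = 1 and F = 12.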
ANOVA Table
Organizes sums of squares, degrees of freedom, mean squares, F-value, and p-value.
| Source of Variation | Sum of Squares (SS) | Degrees of Freedom (df) | Mean Square (MS) | F | p-value |
|---|---|---|---|---|---|
| Between Groups | SSB | k - 1 | MSB = SSB / (k - 1) | F = MSB / MSW | From F(k - 1, N - k) distribution |
| Within Groups | SSW | N - k | MSW = SSW / (N - k) | | |
| Total | SST | N - 1 | | | |
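The table can also be assembled programmatically from SSB, SSW, k and N. A minimal sketch, assuming hypothetical helper names `anova_table` and `format_cell` and illustrative inputs; the p-value column is left out because it requires the F-distribution.

```python
def anova_table(ssb, ssw, k, n_total):
    """Return rows (source, SS, df, MS, F) of a one-way ANOVA table."""
    df_between, df_within = k - 1, n_total - k
    msb, msw = ssb / df_between, ssw / df_within
    return [
        ("Between Groups", ssb, df_between, msb, msb / msw),
        ("Within Groups", ssw, df_within, msw, None),
        ("Total", ssb + ssw, df_between + df_within, None, None),
    ]

def format_cell(value):
    """Blank for empty cells, plain integers for df, two decimals otherwise."""
    if value is None:
        return ""
    if isinstance(value, int):
        return str(value)
    return f"{value:.2f}"

# Hypothetical inputs: SSB = 24, SSW = 6, k = 3 groups, N = 9 observations.
rows = anova_table(24.0, 6.0, 3, 9)
print(f"{'Source':<16}{'SS':>8}{'df':>4}{'MS':>8}{'F':>8}")
for source, ss, df, ms, f_stat in rows:
    print(f"{source:<16}{format_cell(ss):>8}{format_cell(df):>4}"
          f"{format_cell(ms):>8}{format_cell(f_stat):>8}")
```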
F-Distribution
Definition
Continuous probability distribution of F-statistic under null hypothesis. Defined by two degrees of freedom: numerator (df1 = k - 1) and denominator (df2 = N - k).
Shape
Right-skewed, non-negative values. Critical values depend on df1, df2, and significance level α.
Use in ANOVA
Compare computed F to critical F at significance level α: if F > F_critical (equivalently, if the p-value is below α), reject null hypothesis.
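The decision rule can be sketched without a statistics library in one special case. With three groups (df1 = 2) the upper tail of the F-distribution has the closed form P(F ≥ x) = (1 + 2x/df2)^(-df2/2); for other df1 a library routine such as SciPy's `scipy.stats.f.sf` would normally be used. The function names and the observed F below are hypothetical.

```python
def f_upper_tail_df1_2(f_value, df2):
    """P(F >= f_value) for an F-distribution with df1 = 2 only."""
    return (1 + 2 * f_value / df2) ** (-df2 / 2)

def f_critical_df1_2(alpha, df2):
    """Value whose upper-tail probability is alpha, for df1 = 2 only."""
    return (df2 / 2) * (alpha ** (-2 / df2) - 1)

# Hypothetical study: k = 3 groups, N = 9 observations, observed F = 12.
k, n_total = 3, 9
f_observed = 12.0
alpha = 0.05

df1, df2 = k - 1, n_total - k            # df1 = 2, df2 = 6
p_value = f_upper_tail_df1_2(f_observed, df2)
f_crit = f_critical_df1_2(alpha, df2)
reject_h0 = f_observed > f_crit          # same decision as p_value < alpha

print(f"F critical = {f_crit:.3f}, p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

Here the critical value is about 5.14 and the p-value is 0.008, so the null hypothesis is rejected at α = 0.05.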
Post Hoc Tests
Purpose
Identify which specific means differ after rejecting null hypothesis in ANOVA.
Common Methods
Tukey's HSD, Bonferroni correction, Scheffé test, Dunnett's test. Each controls the familywise Type I error rate across the set of comparisons.
Selection Criteria
Depends on sample sizes, number of comparisons, assumptions.
Interpretation
Post hoc results indicate pairwise group differences with adjusted significance levels.
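Of the methods listed, the Bonferroni correction is the simplest to illustrate: each of the m pairwise comparisons is tested at α / m. A minimal sketch, assuming the unadjusted pairwise p-values are already available; the group labels, p-values and function name `bonferroni_significant` are hypothetical.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Return the adjusted threshold and the pairs whose p-value falls below it.

    p_values maps a pair of group labels to its unadjusted p-value.
    """
    adjusted_alpha = alpha / len(p_values)   # alpha divided by number of comparisons
    significant = [pair for pair, p in p_values.items() if p < adjusted_alpha]
    return adjusted_alpha, significant

# Hypothetical unadjusted p-values for the three pairs among groups A, B, C.
raw_p = {("A", "B"): 0.030, ("A", "C"): 0.004, ("B", "C"): 0.200}
adjusted_alpha, significant = bonferroni_significant(raw_p)

print(f"adjusted alpha = {adjusted_alpha:.4f}")
print("significant pairs:", significant)
```

The A versus B comparison would pass at the unadjusted 0.05 level but not at the adjusted level of about 0.0167, which is how the correction trades power for error control.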
Advantages and Disadvantages
Advantages
- Tests multiple groups in a single test, avoiding the inflated Type I error rate of repeated pairwise t-tests.
- Flexible: handles multiple factors, interactions, repeated measures.
- Widely applicable across disciplines.
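The Type I error advantage can be made concrete. With k groups there are k(k - 1)/2 pairwise comparisons, and if each were an independent test at level α, the chance of at least one false positive would be 1 - (1 - α)^m. Pairwise t-tests on shared groups are not truly independent, so the sketch below is an approximation; the function name is hypothetical.

```python
from math import comb

def familywise_error_rate(k, alpha=0.05):
    """Approximate chance of at least one false positive across all pairwise tests.

    Assumes the comb(k, 2) tests are independent, which is a simplification.
    """
    m = comb(k, 2)                      # number of pairwise comparisons
    return 1 - (1 - alpha) ** m

for k in (3, 4, 5):
    print(f"k = {k}: {comb(k, 2)} comparisons, "
          f"familywise error rate ~ {familywise_error_rate(k):.3f}")
```

With five groups the approximate rate is already about 0.40, compared with 0.05 for a single omnibus F-test.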
Disadvantages
- Assumptions must be met for valid results.
- Omnibus test only shows that some means differ, not which ones or in what pattern.
- Post hoc tests add complexity, and their corrections reduce power, raising the risk of Type II errors.
Applications
Experimental Research
Compare treatment effects across groups in medicine, psychology, agriculture.
Quality Control
Assess process variations and factor effects in manufacturing.
Social Sciences
Analyze survey data with categorical predictors on continuous outcomes.
Marketing
Test consumer preferences among multiple product versions.
| Field | Example Use Case |
|---|---|
| Medicine | Comparing drug efficacy among patient groups |
| Agriculture | Testing crop yields under different fertilizers |
| Psychology | Analyzing cognitive test scores by learning method |
| Marketing | Evaluating customer satisfaction across product variants |