Definition of Variance

Concept

Variance quantifies the average squared deviation of data points from the mean. It reflects how widely the values in a dataset are spread.

Mathematical Meaning

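Variance measures how far, on average, the numbers in a set lie from the mean, in squared units. It also reflects how far the numbers lie from one another: the population variance equals half the mean squared difference over all pairs of data points.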

Context in Descriptive Statistics

Variance is a primary measure of dispersion. It complements the measures of central tendency (mean, median, and mode) in describing the characteristics of a data distribution.

Variance Formulas

Population Variance Formula

Population variance (σ²) calculates variance using all data points in the population.

σ² = (1/N) ∑ (xᵢ - μ)²
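The formula above can be sketched directly in code. This is a minimal illustration, not a library implementation; the function name `population_variance` is an assumption made for this example, and the data are the values used later in Example 1.

```python
def population_variance(values):
    """Mean squared deviation from the mean, using the divisor N."""
    n = len(values)
    if n == 0:
        raise ValueError("population variance needs at least one value")
    mu = sum(values) / n  # population mean
    return sum((x - mu) ** 2 for x in values) / n


print(population_variance([2, 4, 6, 8, 10]))  # 8.0
```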

Sample Variance Formula

Sample variance (s²) estimates population variance from a sample, applying Bessel’s correction.

s² = (1/(n-1)) ∑ (xᵢ - x̄)²
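As a rough sketch, the sample formula differs from the population formula only in its divisor. The function name `sample_variance` and the data below are illustrative assumptions, not part of the source.

```python
def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two values")
    x_bar = sum(values) / n  # sample mean
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


# Illustrative data: mean 3, squared deviations 4 + 1 + 0 + 1 + 4 = 10
print(sample_variance([1, 2, 3, 4, 5]))  # 2.5
```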

Symbols Explained

σ²: population variance, s²: sample variance, N: population size, n: sample size, xᵢ: data point, μ: population mean, x̄: sample mean.

Population vs Sample Variance

Population Variance

Measures the true variance of the entire population. It uses the divisor N and is an exact quantity, not an estimate, provided every value in the population is known.

Sample Variance

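Estimates the population variance from sample data. It uses the divisor n-1 (Bessel’s correction) to offset the downward bias that arises because deviations are measured from the sample mean x̄ rather than from the unknown population mean μ.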

Bias and Unbiased Estimation

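With the divisor n-1, the sample variance is an unbiased estimator of the population variance for independent observations. Dividing by n instead underestimates the variance on average, by a factor of (n-1)/n.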
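The effect of the divisor can be checked exactly on a small case. The sketch below assumes a five-value population and enumerates every equally likely sample of size 2 drawn with replacement; the helper names and the data are illustrative choices, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import product

population = [2, 4, 6, 8, 10]  # illustrative population
n = 2                          # sample size


def sum_sq_dev(xs):
    """Sum of squared deviations of xs from its own mean (exact arithmetic)."""
    m = Fraction(sum(xs), len(xs))
    return sum((x - m) ** 2 for x in xs)


sigma2 = sum_sq_dev(population) / len(population)  # true population variance

# All 25 ordered samples of size 2, each equally likely
samples = list(product(population, repeat=n))
avg_div_n_minus_1 = sum(sum_sq_dev(s) / (n - 1) for s in samples) / len(samples)
avg_div_n = sum(sum_sq_dev(s) / n for s in samples) / len(samples)

print(sigma2, avg_div_n_minus_1, avg_div_n)  # 8 8 4
```

Averaged over all samples, the n-1 divisor recovers the population variance of 8, while the n divisor gives 4, which is (n-1)/n times the true value.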

Properties of Variance

Non-negativity

Variance is always ≥ 0. It is zero exactly when all data points are identical.

Units

Variance is expressed in the square of the original data units (e.g., meters²), which limits its direct interpretability.

Effect of Linear Transformations

Variance scales by the square of a multiplicative constant and is unaffected by an additive constant: Var(aX + b) = a² Var(X).
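A quick numerical check of this property, as a sketch under assumed values: the constants a = 3 and b = 7 and the helper `pvar` are chosen only for illustration.

```python
from fractions import Fraction


def pvar(values):
    """Population variance with exact arithmetic."""
    n = len(values)
    mu = Fraction(sum(values), n)
    return sum((x - mu) ** 2 for x in values) / n


x = [2, 4, 6, 8, 10]
a, b = 3, 7                  # illustrative constants
y = [a * v + b for v in x]   # transformed data aX + b

print(pvar(x), pvar(y))  # 8 72
```

The shift b leaves the spread unchanged, and the scale a = 3 multiplies the variance by a² = 9.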

Calculation Methods

Direct Method

Calculate the mean, sum the squared deviations from it, then divide by N (population) or n-1 (sample).

Computational Shortcut

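Use the formula s² = (∑xᵢ² - (∑xᵢ)²/n) / (n-1) for samples. It needs only a single pass over the data, but in floating-point arithmetic it can lose precision through cancellation when the mean is large relative to the spread, so it does not reduce rounding error.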

Software and Tools

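Statistical software (R, SPSS, Python) computes variance with built-in functions. Check which divisor a function uses: R's var() divides by n-1, while NumPy's var() divides by n unless ddof=1 is passed.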
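As one example, Python's standard library `statistics` module provides separate functions for the two divisors: `pvariance` (divisor N) and `variance` (divisor n-1). The data below are illustrative.

```python
import statistics

data = [2.0, 4.0, 6.0, 8.0, 10.0]  # illustrative data

print(statistics.pvariance(data))  # 8.0   (divisor N)
print(statistics.variance(data))   # 10.0  (divisor n - 1)
```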

Method                 | Description
Direct Method          | Calculate mean, then squared deviations, then average
Computational Shortcut | Use sum of squares and sum of values to avoid repeated subtraction

Interpretation of Variance

Measure of Dispersion

High variance means the data are widely spread; low variance means the data cluster near the mean.

Contextual Meaning

Variance must be interpreted relative to the mean and the units of the data; its raw magnitude alone is not informative.

Relation to Data Consistency

Lower variance implies higher consistency or reliability in data values.

Applications of Variance

Statistical Modeling

Variance is essential in regression analysis, ANOVA, and hypothesis testing, and it is a defining parameter of many probability distributions.

Risk Assessment

Finance uses variance to measure volatility and risk of investment returns.

Quality Control

Variance is used to monitor process variability and product consistency in manufacturing.

Limitations of Variance

Units Squared

Because variance is expressed in squared units, it is less intuitive to interpret than measures in the original units.

Sensitivity to Outliers

Variance is disproportionately affected by extreme values because deviations are squared.
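A small illustration of this sensitivity, using made-up data in which a single value is replaced by an extreme one:

```python
import statistics

clustered = [10.0, 11.0, 12.0, 13.0, 14.0]      # illustrative data
with_outlier = [10.0, 11.0, 12.0, 13.0, 104.0]  # last value replaced by an outlier

print(statistics.variance(clustered))     # 2.5
print(statistics.variance(with_outlier))  # 1712.5
```

Changing one of five values raises the sample variance by a factor of 685 here, because the outlier's deviation of 74 contributes 74² = 5476 to the sum of squares.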

Non-Robustness

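Variance can be a misleading summary for highly skewed distributions. In such cases, transform the data first or use a robust alternative such as the interquartile range or the median absolute deviation.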

Variance vs Standard Deviation

Definition

Variance: average squared deviation. Standard deviation: square root of variance.

Units Comparison

Standard deviation is expressed in the same units as the data; variance is expressed in squared units.
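The unit difference can be seen in a short sketch. The lengths below are hypothetical values in meters, chosen only for illustration.

```python
import math
import statistics

lengths_m = [2.0, 4.0, 6.0, 8.0, 10.0]  # hypothetical lengths in meters

var_m2 = statistics.pvariance(lengths_m)  # variance, in square meters
sd_m = math.sqrt(var_m2)                  # standard deviation, in meters

print(f"variance = {var_m2} m^2, standard deviation = {sd_m:.3f} m")
# variance = 8.0 m^2, standard deviation = 2.828 m
```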

Preferred Usage

Standard deviation is preferred for interpretation. Variance is preferred in theoretical derivations, partly because the variances of independent random variables add.

Computational Formula

Sample Variance Shortcut Formula

s² = [ ∑xᵢ² - ( (∑xᵢ)² / n ) ] / (n - 1)

Derivation

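The shortcut follows from expanding the squared deviations and substituting ∑xᵢ = n x̄. It reduces computational steps by avoiding subtraction of the mean from every value.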
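The identity behind the shortcut can be written out step by step. Expanding the square and using ∑xᵢ = n x̄:

```latex
\begin{aligned}
\sum_{i=1}^{n} (x_i - \bar{x})^2
  &= \sum_{i=1}^{n} x_i^2 - 2\bar{x}\sum_{i=1}^{n} x_i + n\bar{x}^2 \\
  &= \sum_{i=1}^{n} x_i^2 - 2n\bar{x}^2 + n\bar{x}^2 \\
  &= \sum_{i=1}^{n} x_i^2 - n\bar{x}^2 \\
  &= \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}
\end{aligned}
```

Dividing the last line by n - 1 gives the shortcut formula for s².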

Implementation Example

Accumulate the sum of the values and the sum of their squares in a single pass, then apply the formula.
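A minimal single-pass sketch of this approach follows. The function name `sample_variance_shortcut` is an assumption made for this example, and the data are those of Example 2.

```python
def sample_variance_shortcut(values):
    """Sample variance from running totals of x and x squared (one pass)."""
    n = 0
    total = 0.0     # running sum of x
    total_sq = 0.0  # running sum of x squared
    for x in values:
        n += 1
        total += x
        total_sq += x * x
    if n < 2:
        raise ValueError("sample variance needs at least two values")
    return (total_sq - total * total / n) / (n - 1)


print(sample_variance_shortcut([3, 7, 7, 19]))  # 48.0
```

Because the function subtracts two potentially large, nearly equal totals, it may lose precision in floating-point arithmetic when the values are large relative to their spread. Computing deviations from the mean is the safer choice in that case.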

Worked Examples

Example 1: Population Variance

Data: 2, 4, 6, 8, 10; N=5; μ=6.

Variance: σ² = (1/5)[(2-6)² + (4-6)² + (6-6)² + (8-6)² + (10-6)²] = (1/5)(16 + 4 + 0 + 4 + 16) = 8.

Example 2: Sample Variance

Data: 3, 7, 7, 19; n=4; x̄=9.

Variance: s² = (1/3)[(3-9)² + (7-9)² + (7-9)² + (19-9)²] = (1/3)(36 + 4 + 4 + 100) = 48.

Value (xᵢ) | Deviation (xᵢ - x̄) | Squared Deviation
3          | -6                  | 36
7          | -2                  | 4
7          | -2                  | 4
19         | 10                  | 100
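The deviation table for Example 2 can be reproduced with a few lines of code. This is only a sketch; the variable names are illustrative.

```python
data = [3, 7, 7, 19]
n = len(data)
x_bar = sum(data) / n  # sample mean, 9.0

# One row per value: (value, deviation, squared deviation)
rows = [(x, x - x_bar, (x - x_bar) ** 2) for x in data]
for x, dev, sq in rows:
    print(f"{x:>5} {dev:>10.1f} {sq:>10.1f}")

s2 = sum(sq for _, _, sq in rows) / (n - 1)
print("s^2 =", s2)  # s^2 = 48.0
```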
