Definition of Variance
Basic Concept
Variance quantifies spread of random variable outcomes around their mean (expected value). It is the average of squared deviations from the mean, indicating dispersion magnitude.
Purpose
Measures variability, uncertainty, or risk in probabilistic and statistical models. Key for understanding distribution shape and concentration.
Terminology
Denoted as Var(X) or σ² for random variable X. Units are square of original variable units, reflecting squared deviations.
Mathematical Formulation
Variance Definition Formula
Var(X) = E[(X - E[X])²]
where E denotes the expectation operator and X is the random variable.
Expanded Formula
Var(X) = E[X²] - (E[X])²
Equivalently, the variance is the difference between the second moment and the square of the first moment.
Notation
σ² = Var(X), σ = standard deviation (square root of variance).
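The definition and the expanded formula above can be checked against each other numerically. The following is a minimal sketch, assuming a fair six-sided die as the distribution (a hypothetical example, not taken from the text) and using exact fractions so that rounding does not blur the comparison. The function names are illustrative.

```python
from fractions import Fraction

def variance_by_definition(values, probs):
    """Var(X) = E[(X - E[X])^2] for a discrete distribution."""
    mean = sum(p * x for x, p in zip(values, probs))
    return sum(p * (x - mean) ** 2 for x, p in zip(values, probs))

def variance_by_moments(values, probs):
    """Var(X) = E[X^2] - (E[X])^2 for the same distribution."""
    first_moment = sum(p * x for x, p in zip(values, probs))
    second_moment = sum(p * x ** 2 for x, p in zip(values, probs))
    return second_moment - first_moment ** 2

# Hypothetical fair six-sided die, chosen only for illustration.
die_values = [Fraction(k) for k in range(1, 7)]
die_probs = [Fraction(1, 6)] * 6

var_def = variance_by_definition(die_values, die_probs)
var_mom = variance_by_moments(die_values, die_probs)
sigma = float(var_def) ** 0.5  # standard deviation, square root of variance

print(var_def, var_mom)  # both print 35/12
```

Both routes give the same value, and σ is recovered as the square root of σ².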
Properties of Variance
Non-negativity
Variance is always ≥ 0, since it is an expectation of a squared quantity.
Variance of Constant
Var(c) = 0 for any constant c, no variability.
Scaling Property
For constants a and b, Var(aX + b) = a² Var(X): variance scales quadratically with a and is unaffected by the shift b.
Additivity for Independent Variables
If X, Y independent, Var(X + Y) = Var(X) + Var(Y).
Variance and Covariance
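Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y) for any X and Y with finite variances. The covariance term vanishes when X and Y are independent (or merely uncorrelated), which recovers the additivity property.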
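The properties in this section can be verified on small joint distributions. This is a sketch under stated assumptions: the two joint pmfs below are hypothetical, the helper names (`expect`, `var`, `cov`) are illustrative, and exact fractions are used so the identities hold with equality rather than approximately.

```python
from fractions import Fraction
from itertools import product

def expect(joint, g):
    """E[g(X, Y)] for a joint pmf given as {(x, y): probability}."""
    return sum(p * g(x, y) for (x, y), p in joint.items())

def var(joint, g):
    """Var(g(X, Y)) computed as E[g^2] - (E[g])^2."""
    return expect(joint, lambda x, y: g(x, y) ** 2) - expect(joint, g) ** 2

def cov(joint):
    """Cov(X, Y) = E[(X - E[X])(Y - E[Y])]."""
    ex = expect(joint, lambda x, y: x)
    ey = expect(joint, lambda x, y: y)
    return expect(joint, lambda x, y: (x - ex) * (y - ey))

# Hypothetical dependent pair: X and Y tend to take the same value.
dependent = {
    (0, 0): Fraction(4, 10), (0, 1): Fraction(1, 10),
    (1, 0): Fraction(1, 10), (1, 1): Fraction(4, 10),
}

# Hypothetical independent pair: the joint pmf is the product of the marginals.
px = {0: Fraction(1, 4), 1: Fraction(3, 4)}
py = {0: Fraction(1, 2), 2: Fraction(1, 2)}
independent = {(x, y): px[x] * py[y] for x, y in product(px, py)}

var_x = var(dependent, lambda x, y: x)
var_y = var(dependent, lambda x, y: y)
var_sum_dep = var(dependent, lambda x, y: x + y)
var_scaled = var(dependent, lambda x, y: 3 * x)
var_const = var(dependent, lambda x, y: 5)
var_sum_ind = var(independent, lambda x, y: x + y)

print(var_sum_dep == var_x + var_y + 2 * cov(dependent))  # True
print(var_scaled == 9 * var_x)                            # True
```

For the dependent pair the covariance is positive, so Var(X + Y) exceeds Var(X) + Var(Y). For the independent pair the covariance is zero and the variances simply add.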
Variance of Discrete Random Variables
Definition
Given discrete variable X with values x_i and probabilities p_i:
Var(X) = Σ p_i (x_i - μ)², where μ = E[X] = Σ p_i x_i
Calculation Steps
- Compute mean μ
- Calculate squared deviations (x_i - μ)²
- Multiply by probabilities p_i
- Sum over all i
Example
Consider X values {1, 2, 3} with probabilities {0.2, 0.5, 0.3}:
The mean is μ = 1×0.2 + 2×0.5 + 3×0.3 = 0.2 + 1.0 + 0.9 = 2.1.
| x_i | p_i | (x_i - μ)² | p_i (x_i - μ)² |
|---|---|---|---|
| 1 | 0.2 | (1 - 2.1)² = 1.21 | 0.2 × 1.21 = 0.242 |
| 2 | 0.5 | (2 - 2.1)² = 0.01 | 0.5 × 0.01 = 0.005 |
| 3 | 0.3 | (3 - 2.1)² = 0.81 | 0.3 × 0.81 = 0.243 |
Summation: Var(X) = 0.242 + 0.005 + 0.243 = 0.49
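The four calculation steps can be sketched directly in code. This is a minimal illustration applied to the example distribution above; the function name `discrete_variance` is an assumption made for this sketch.

```python
def discrete_variance(values, probs):
    """Return (mean, variance) of a discrete random variable."""
    # Step 1: compute the mean mu = sum of p_i * x_i
    mu = sum(p * x for x, p in zip(values, probs))
    # Step 2: squared deviations (x_i - mu)^2
    squared_devs = [(x - mu) ** 2 for x in values]
    # Step 3: weight each squared deviation by its probability p_i
    weighted = [p * d for p, d in zip(probs, squared_devs)]
    # Step 4: sum over all i
    return mu, sum(weighted)

mu, var_x = discrete_variance([1, 2, 3], [0.2, 0.5, 0.3])
print(round(mu, 6), round(var_x, 6))
```

Up to floating-point rounding, this reproduces μ = 2.1 and Var(X) = 0.49 from the table.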
Variance of Continuous Random Variables
Definition
For continuous variable X with pdf f(x):
Var(X) = ∫ (x - μ)² f(x) dx, where μ = E[X] = ∫ x f(x) dx
Integration Domain
Integral over entire support of X, possibly infinite limits.
Example: Uniform Distribution
X ~ Uniform(a, b), pdf f(x) = 1/(b - a) for x in [a, b]:
Var(X) = (b - a)² / 12
Example: Normal Distribution
X ~ N(μ, σ²): Variance is σ², parameter of distribution.
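The integral definition can be approximated numerically and compared with the closed form for the uniform distribution. The sketch below assumes a simple midpoint rule on a bounded support (it is not suitable as written for infinite limits), and the endpoints a = 2, b = 8 are chosen for illustration.

```python
def integrate(g, lo, hi, steps=10_000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    width = (hi - lo) / steps
    return width * sum(g(lo + (i + 0.5) * width) for i in range(steps))

def continuous_variance(pdf, lo, hi):
    """Approximate Var(X) for a density supported on [lo, hi]."""
    mu = integrate(lambda x: x * pdf(x), lo, hi)
    return integrate(lambda x: (x - mu) ** 2 * pdf(x), lo, hi)

a, b = 2.0, 8.0

def uniform_pdf(x):
    """Density of Uniform(a, b) on its support."""
    return 1.0 / (b - a)

approx = continuous_variance(uniform_pdf, a, b)
exact = (b - a) ** 2 / 12
print(approx, exact)
```

The numerical value should agree with (b - a)² / 12 to several decimal places; the small remaining gap is discretization error from the midpoint rule.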
Variance and Expectation Relationship
Variance as Expectation of Squared Deviation
Directly derived from expectation operator applied to squared difference.
Moment Interpretation
Variance = second central moment; relates to distribution shape.
Relation to Moments
Second moment E[X²] and first moment E[X] fully determine variance.
Expectation Linearity
Expectation is a linear operator, E[aX + bY] = a E[X] + b E[Y], but variance is not: Var(aX) = a² Var(X), and Var(X + Y) includes a covariance term.
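The link between the two variance formulas rests entirely on linearity of expectation. The derivation below is a short sketch that writes μ for E[X] and assumes E[X²] is finite; the second part shows why variance itself fails to be linear.

```latex
\begin{aligned}
\operatorname{Var}(X) &= E\big[(X - \mu)^2\big], \qquad \mu = E[X] \\
                      &= E\big[X^2 - 2\mu X + \mu^2\big] \\
                      &= E[X^2] - 2\mu\,E[X] + \mu^2 \\
                      &= E[X^2] - \mu^2 \;=\; E[X^2] - (E[X])^2 . \\[6pt]
\operatorname{Var}(aX + b) &= E\big[(aX + b - (a\mu + b))^2\big] \\
                           &= E\big[a^2 (X - \mu)^2\big] \;=\; a^2 \operatorname{Var}(X) .
\end{aligned}
```

The constant b drops out and a enters squared, so Var(aX + b) ≠ a Var(X) + b in general.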
Computational Formulas and Methods
Standard Formula
Var(X) = E[X²] - (E[X])²
Computational Shortcut
Calculate E[X] and E[X²] separately, then subtract squared mean.
Sample Variance
Estimator for population variance from sample data:
S² = (1/(n-1)) Σ (x_i - x̄)²
Alternative Formula for Sample Variance
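S² = (1/(n-1)) [Σ x_i² - n x̄²]
Computational Efficiency
The alternative formula needs only a single pass over the data, which makes it convenient in algorithms and statistical software. In floating-point arithmetic, however, subtracting two large, nearly equal sums can cause catastrophic cancellation, so the two-pass form or an online update such as Welford's algorithm is preferred when numerical stability matters.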
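The two sample-variance formulas above can be compared side by side. The sketch below also includes Welford's online update, a well-known single-pass method that the text does not cover and that is added here as a labelled alternative. The sample is the five-point data set used in the worked examples, and the shifted copy (every value plus 10⁹) is a hypothetical stress test for floating-point behavior.

```python
import statistics

def two_pass_variance(data):
    """S^2 = (1/(n-1)) * sum((x_i - mean)^2): mean first, then deviations."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)

def shortcut_variance(data):
    """S^2 = (1/(n-1)) * (sum(x_i^2) - n * mean^2): fast, but cancellation-prone."""
    n = len(data)
    mean = sum(data) / n
    return (sum(x * x for x in data) - n * mean * mean) / (n - 1)

def welford_variance(data):
    """Welford's single-pass update (not from the text; a stable alternative)."""
    count, mean, m2 = 0, 0.0, 0.0
    for x in data:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
    return m2 / (count - 1)

sample = [5, 7, 8, 4, 6]
shifted = [x + 1e9 for x in sample]  # same spread, very large mean

for name, fn in [("two-pass", two_pass_variance),
                 ("shortcut", shortcut_variance),
                 ("welford", welford_variance),
                 ("statistics", statistics.variance)]:
    print(name, fn(sample), fn(shifted))
```

On the small sample all four agree on 2.5. On the shifted data the true variance is still 2.5, but the shortcut formula subtracts two sums near 5 × 10¹⁸ and loses the answer entirely, while the two-pass and Welford versions stay close to 2.5.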
Applications of Variance
Risk Assessment
Quantifies uncertainty in finance, insurance, reliability engineering.
Quality Control
Monitors variation in manufacturing processes.
Statistical Inference
Basis for confidence intervals, hypothesis testing, ANOVA.
Machine Learning
Used in model evaluation metrics, bias-variance tradeoff analysis.
Signal Processing
Measures noise power and signal dispersion.
Variance in Statistical Analysis
Descriptive Statistics
Variance complements mean, median, mode to describe data.
ANOVA
Partition total variance into components to test group differences.
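The partition behind ANOVA can be illustrated with sums of squares: the total sum of squares splits into a between-group part and a within-group part. The following sketch assumes three small hypothetical groups and shows only the decomposition, not a full F-test.

```python
def sum_of_squares_partition(groups):
    """Return (total, between, within) sums of squares for a list of groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    total = sum((x - grand_mean) ** 2 for x in all_values)
    between = sum(len(g) * (m - grand_mean) ** 2
                  for g, m in zip(groups, group_means))
    within = sum((x - m) ** 2
                 for g, m in zip(groups, group_means) for x in g)
    return total, between, within

# Hypothetical measurements for three groups, for illustration only.
groups = [[4, 5, 6], [7, 8, 9], [1, 2, 3]]
total, between, within = sum_of_squares_partition(groups)
print(total, between, within)  # 60.0 54.0 6.0
```

Here most of the total variation (54 of 60) lies between the groups, which is the pattern ANOVA tests for.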
Regression Analysis
Measures how much of the dependent variable's variability the regression line explains (R²) and how much remains as residual variance around the line.
Standard Deviation
Square root of variance, interpretable in original units.
Coefficient of Variation
Ratio of standard deviation to mean for relative comparison.
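The difference in units between variance, standard deviation, and the coefficient of variation shows up when the same data are rescaled. This sketch uses Python's standard `statistics` module on hypothetical height measurements expressed in centimetres and in metres.

```python
import math
import statistics

def coefficient_of_variation(data):
    """Sample standard deviation divided by the sample mean (unitless)."""
    return statistics.stdev(data) / statistics.mean(data)

# Hypothetical heights, for illustration only.
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
heights_m = [h / 100 for h in heights_cm]

var_ratio = statistics.variance(heights_cm) / statistics.variance(heights_m)
sd_ratio = statistics.stdev(heights_cm) / statistics.stdev(heights_m)
cv_cm = coefficient_of_variation(heights_cm)
cv_m = coefficient_of_variation(heights_m)

print(var_ratio, sd_ratio, cv_cm, cv_m)
```

Changing units by a factor of 100 changes the variance by a factor of about 100² and the standard deviation by about 100, while the coefficient of variation is essentially unchanged, which is what makes it useful for relative comparison.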
Limitations and Interpretation
Units Squared
Variance units are squared, complicating direct interpretation.
Sensitivity to Outliers
Large deviations disproportionately affect variance.
Non-Robustness
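Variance is not robust: a few extreme values can dominate it, and for some heavy-tailed distributions (the Cauchy distribution, for example) it is infinite or undefined. It can also be a misleading summary of spread for heavily skewed data.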
Alternative Measures
Use interquartile range, median absolute deviation for robustness.
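The contrast between variance and the robust alternatives can be seen by adding a single outlier. This is a minimal sketch with hypothetical data; it uses `statistics.quantiles` with its default method, and other quartile conventions can give slightly different IQR values.

```python
import statistics

def median_absolute_deviation(data):
    """Median of absolute deviations from the median (unscaled MAD)."""
    med = statistics.median(data)
    return statistics.median([abs(x - med) for x in data])

def interquartile_range(data):
    """Q3 - Q1 using the default method of statistics.quantiles."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    return q3 - q1

# Hypothetical data; the second list replaces the largest value with an outlier.
clean = [4, 5, 5, 6, 6, 7, 7, 8]
with_outlier = clean[:-1] + [80]

for label, data in [("clean", clean), ("with outlier", with_outlier)]:
    print(label,
          statistics.variance(data),
          median_absolute_deviation(data),
          interquartile_range(data))
```

The single outlier inflates the sample variance by more than two orders of magnitude, while the MAD and IQR are unchanged for this data set.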
Interpretation Context
High variance implies diverse outcomes; low variance implies concentration.
Worked Examples
Example 1: Discrete Random Variable
Given X with probabilities and values:
| Value (x_i) | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| Probability | 0.1 | 0.2 | 0.4 | 0.3 |
Calculate E[X]:
E[X] = 0×0.1 + 1×0.2 + 2×0.4 + 3×0.3 = 0 + 0.2 + 0.8 + 0.9 = 1.9
Calculate E[X²]:
E[X²] = 0²×0.1 + 1²×0.2 + 2²×0.4 + 3²×0.3 = 0 + 0.2 + 1.6 + 2.7 = 4.5
Variance:
Var(X) = 4.5 - (1.9)² = 4.5 - 3.61 = 0.89
Example 2: Continuous Random Variable (Uniform)
X ~ Uniform(2, 8). Variance formula:
Var(X) = (b - a)² / 12 = (8 - 2)² / 12 = 36 / 12 = 3
Example 3: Sample Variance
Sample data: {5, 7, 8, 4, 6}
Calculate mean:
x̄ = (5 + 7 + 8 + 4 + 6) / 5 = 30 / 5 = 6
Calculate variance:
S² = (1/(5-1)) × [(5-6)² + (7-6)² + (8-6)² + (4-6)² + (6-6)²] = (1/4) × [1 + 1 + 4 + 4 + 0] = (1/4) × 10 = 2.5
Summary
- Variance measures dispersion of random variables around their mean.
- Defined as expectation of squared deviations, Var(X) = E[(X - E[X])²].
- Properties include non-negativity, scaling, and additivity for independent variables.
- Applicable to discrete and continuous variables with respective formulas.
- Essential in risk assessment, statistics, machine learning, and quality control.
- Limitations: units squared, sensitivity to outliers, requires careful interpretation.
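- Computational shortcuts improve efficiency, but the one-pass shortcut formula can lose precision in floating-point arithmetic, so numerically stable methods are preferred for large-valued data.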
- Variance forms the basis for many statistical inference techniques.