Definition
Random Variables and Expectation
Random variables X, Y: measurable functions from sample space to real numbers. Expectation operator E[·]: weighted average over probability space. Joint expectation E[XY]: expectation of product.
Covariance Formula
Covariance of X and Y, denoted Cov(X, Y), defined as:
Cov(X, Y) = E[(X - E[X])(Y - E[Y])] = E[XY] - E[X] E[Y]
Existence Conditions
Covariance exists if E[X], E[Y], and E[XY] exist (finite). Requires finite second moments.
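The two forms of the formula agree numerically. As an illustrative check on a small hypothetical dataset (values chosen for the example, not from the text):

```python
# Hypothetical toy dataset for illustration.
xs = [1.0, 2.0, 4.0, 7.0]
ys = [2.0, 3.0, 5.0, 11.0]
n = len(xs)

ex = sum(xs) / n                               # E[X] under the empirical distribution
ey = sum(ys) / n                               # E[Y]
exy = sum(x * y for x, y in zip(xs, ys)) / n   # E[XY]

# Definition form: E[(X - E[X])(Y - E[Y])]
cov_def = sum((x - ex) * (y - ey) for x, y in zip(xs, ys)) / n
# Shortcut form: E[XY] - E[X]E[Y]
cov_short = exy - ex * ey

assert abs(cov_def - cov_short) < 1e-12
```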
Interpretation
Measure of Joint Variability
Covariance quantifies linear joint variability. Positive: variables increase together. Negative: one increases, other decreases.
Indicator of Dependence
Nonzero covariance implies the variables are dependent (linearly associated). Zero covariance does NOT imply independence, except for jointly Gaussian variables.
Scale Dependence
Covariance magnitude depends on units and scale of variables. Not standardized.
Properties
Symmetry
Cov(X, Y) = Cov(Y, X).
Linearity in Each Argument
Cov(aX + b, Y) = a Cov(X, Y), for constants a, b. More generally, Cov(aX + bZ, Y) = a Cov(X, Y) + b Cov(Z, Y).
Relation to Variance
Cov(X, X) = Var(X), variance of X.
Bounds
By Cauchy-Schwarz inequality: |Cov(X, Y)| ≤ √(Var(X) Var(Y)).
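The bound can be checked empirically. A minimal sketch on randomly generated data (the construction of ys as a noisy linear function of xs is an illustrative assumption):

```python
import random

random.seed(0)
n = 1000
xs = [random.gauss(0.0, 1.0) for _ in range(n)]
# y correlated with x plus noise (hypothetical construction).
ys = [0.7 * x + random.gauss(0.0, 0.5) for x in xs]

mx = sum(xs) / n
my = sum(ys) / n
cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
var_x = sum((x - mx) ** 2 for x in xs) / n
var_y = sum((y - my) ** 2 for y in ys) / n

# Cauchy-Schwarz: |Cov(X, Y)| <= sqrt(Var(X) Var(Y))
assert abs(cov) <= (var_x * var_y) ** 0.5
```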
Calculation Methods
Analytical Computation
Given joint pdf or pmf, evaluate integrals or sums for E[X], E[Y], E[XY].
Using Data Samples
Estimate using sample averages. See Sample Covariance section.
Computational Algorithms
Online algorithms update covariance incrementally for streaming data.
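Such an incremental update can be sketched as follows (a Welford-style scheme; class and attribute names are illustrative, not from the text):

```python
class StreamingCovariance:
    """Incrementally tracks the covariance of a stream of (x, y) pairs."""

    def __init__(self):
        self.n = 0
        self.mean_x = 0.0
        self.mean_y = 0.0
        self.comoment = 0.0  # running sum of (x - mean_x)(y - mean_y)

    def update(self, x, y):
        self.n += 1
        dx = x - self.mean_x                      # deviation from the OLD x mean
        self.mean_x += dx / self.n
        self.mean_y += (y - self.mean_y) / self.n
        self.comoment += dx * (y - self.mean_y)   # uses the NEW y mean

    def covariance(self, sample=True):
        d = self.n - 1 if sample else self.n
        return self.comoment / d

sc = StreamingCovariance()
for x, y in [(2, 3), (4, 5), (6, 7)]:
    sc.update(x, y)
print(sc.covariance())  # → 4.0
```

A single pass over the stream suffices, and no raw data needs to be retained.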
Update formulas (Welford-style). Let n = number of samples seen so far, Mean_X_n, Mean_Y_n = previous means, and C_n = ∑ (x_i - Mean_X)(y_i - Mean_Y) the running co-moment. For a new pair (x, y):
Mean_X_{n+1} = Mean_X_n + (x - Mean_X_n)/(n+1)
Mean_Y_{n+1} = Mean_Y_n + (y - Mean_Y_n)/(n+1)
C_{n+1} = C_n + (x - Mean_X_n)(y - Mean_Y_{n+1})
Cov_{n+1} = C_{n+1}/n (sample) or C_{n+1}/(n+1) (population).
Relationship with Variance
Variance as Special Case
Variance measures spread of one variable: Var(X) = Cov(X, X).
Variance Decomposition
Variance of sum: Var(X+Y) = Var(X) + Var(Y) + 2 Cov(X, Y).
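The decomposition can be verified on a small hypothetical dataset (helper names are illustrative):

```python
xs = [1.0, 2.0, 4.0, 7.0]
ys = [2.0, 3.0, 5.0, 11.0]

def mean(v):
    return sum(v) / len(v)

def var(v):
    m = mean(v)
    return sum((a - m) ** 2 for a in v) / len(v)

def cov(u, v):
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)

sums = [x + y for x, y in zip(xs, ys)]
lhs = var(sums)
rhs = var(xs) + var(ys) + 2 * cov(xs, ys)
assert abs(lhs - rhs) < 1e-9  # Var(X+Y) = Var(X) + Var(Y) + 2 Cov(X, Y)
```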
Covariance as Generalization
Captures joint variability beyond individual variances.
Covariance Matrix
Definition
For vector random variable X = (X₁, ..., Xₙ), covariance matrix Σ with entries Σ_ij = Cov(X_i, X_j).
Properties
Matrix Σ is symmetric, positive semi-definite.
Use in Multivariate Statistics
Describes joint variability of multiple variables; key in PCA, multivariate normal distribution.
| Entry | Formula |
|---|---|
| Diagonal | Cov(X_i, X_i) = Var(X_i) |
| Off-diagonal | Cov(X_i, X_j), i ≠ j |
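For two variables, the matrix and its properties can be sketched directly (the data series are hypothetical; positive semi-definiteness of a 2×2 symmetric matrix with nonnegative diagonal reduces to a nonnegative determinant):

```python
# Two illustrative series (hypothetical data).
data = [[2.0, 4.0, 6.0, 8.0],   # X1
        [3.0, 5.0, 7.0, 11.0]]  # X2

def cov(u, v):
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (n - 1)

# Sigma_ij = Cov(X_i, X_j)
sigma = [[cov(data[i], data[j]) for j in range(2)] for i in range(2)]

assert sigma[0][1] == sigma[1][0]             # symmetry
det = sigma[0][0] * sigma[1][1] - sigma[0][1] ** 2
assert det >= 0 and sigma[0][0] >= 0          # PSD in the 2x2 case
```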
Covariance and Correlation
Definition of Correlation
Correlation coefficient ρ(X, Y) = Cov(X, Y) / (σ_X σ_Y), where σ_X = √Var(X).
Standardization
Correlation ranges from -1 to 1, dimensionless and scale-invariant.
Interpretation Differences
Covariance magnitude affected by scale; correlation provides normalized measure of linear association.
| Measure | Range | Scale Dependence |
|---|---|---|
| Covariance | (-∞, +∞) | Depends on variable units |
| Correlation | [-1, 1] | Unitless, normalized |
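Scale invariance distinguishes the two measures in practice. A minimal sketch (data and helper are illustrative): rescaling one variable, e.g. converting units, changes the covariance but leaves the correlation unchanged.

```python
def corr(u, v):
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    c = sum((a - mu) * (b - mv) for a, b in zip(u, v)) / n
    sx = (sum((a - mu) ** 2 for a in u) / n) ** 0.5
    sy = (sum((b - mv) ** 2 for b in v) / n) ** 0.5
    return c / (sx * sy)

xs = [1.0, 2.0, 4.0, 7.0]
ys = [2.0, 3.0, 5.0, 11.0]

r = corr(xs, ys)
# Rescaling x by 100 (e.g., metres -> centimetres) leaves correlation unchanged.
r_scaled = corr([100 * x for x in xs], ys)
assert abs(r - r_scaled) < 1e-12
assert -1.0 <= r <= 1.0
```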
Applications
Statistics and Data Analysis
Detect linear relationships between variables. Build predictive models.
Finance
Portfolio theory: covariance quantifies asset co-movements; risk diversification.
Machine Learning
Feature selection, dimensionality reduction (PCA uses covariance matrix).
Signal Processing
Noise analysis, system identification.
Sample Covariance Estimation
Sample Covariance Formula
Given data {(x_i, y_i)} for i=1..n, sample covariance S_XY:
S_{XY} = (1/(n-1)) ∑_{i=1}^n (x_i - x̄)(y_i - ȳ)
Unbiased Estimator
Dividing by n-1 provides unbiased estimator of population covariance.
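A minimal sketch of the estimator (function name is illustrative), applied to a small hypothetical dataset:

```python
def sample_cov(xs, ys):
    """Unbiased sample covariance: divide by n - 1."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    return sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / (n - 1)

print(sample_cov([2, 4, 6], [3, 5, 7]))  # → 4.0
```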
Computational Considerations
Numerical stability improved using two-pass algorithm.
Two-pass algorithm:
1. Compute means x̄, ȳ.
2. Compute the sum of products (x_i - x̄)(y_i - ȳ).
3. Divide by (n - 1).
Limitations
Only Measures Linear Dependence
Nonlinear relationships may have zero covariance.
Scale Sensitivity
Interpretation complicated by units and variable scales.
Zero Covariance ≠ Independence
Variables may be dependent but uncorrelated.
Outlier Influence
Covariance sensitive to extreme values.
Examples
Simple Discrete Case
X, Y take values {1,2}, joint pmf uniform:
Values: (X, Y) = (1,1), (1,2), (2,1), (2,2), P = 0.25 each.
E[X] = 1.5, E[Y] = 1.5
E[XY] = (1*1 + 1*2 + 2*1 + 2*2)*0.25 = (1 + 2 + 2 + 4)*0.25 = 9*0.25 = 2.25
Cov(X, Y) = E[XY] - E[X]E[Y] = 2.25 - (1.5)(1.5) = 2.25 - 2.25 = 0
Continuous Case: Bivariate Normal
Covariance parameter σ₁₂ defines linear dependence strength in joint pdf.
Sample Data Example
Data points: (2,3), (4,5), (6,7).
x̄ = (2+4+6)/3 = 4
ȳ = (3+5+7)/3 = 5
S_{XY} = (1/2)[(2-4)(3-5) + (4-4)(5-5) + (6-4)(7-5)] = (1/2)[(-2)(-2) + 0 + 2*2] = (1/2)(4 + 0 + 4) = 4
References
- Anderson, T.W., An Introduction to Multivariate Statistical Analysis, Wiley, 2003, pp. 25-40.
- Casella, G., Berger, R.L., Statistical Inference, Duxbury, 2002, pp. 235-240.
- Graybill, F.A., Introduction to Matrices with Applications in Statistics, Wadsworth, 1983, pp. 67-75.
- Johnson, R.A., Wichern, D.W., Applied Multivariate Statistical Analysis, Pearson, 2014, pp. 50-60.
- Wasserman, L., All of Statistics: A Concise Course in Statistical Inference, Springer, 2004, pp. 90-95.