Definition
Concept
Standard deviation (SD): statistical metric quantifying spread or dispersion of numeric data relative to mean. Units: same as original data. Purpose: measure variability, consistency, risk.
Data Distribution Context
Applicable for numerical datasets. Used with normal, skewed, or unknown distributions. Complements mean and median for comprehensive analysis.
Role in Descriptive Statistics
Summarizes data spread succinctly. Helps detect outliers, assess homogeneity, compare datasets.
Historical Background
Origins
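Concept emerged in 19th century with development of probability and statistics. Karl Pearson introduced the term "standard deviation" (1894) for the root-mean-square deviation from the mean; the name "variance" for its square came later (R. A. Fisher, 1918).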
Evolution
Refined over decades. Incorporated in inferential statistics, quality control, finance.
Modern Usage
Foundational tool in statistics, machine learning, experimental sciences, economics.
Mathematical Formula
Population Standard Deviation
$$\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2}$$
Where: σ = population standard deviation, N = population size, xᵢ = individual data points, μ = population mean.
Sample Standard Deviation
$$s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$$
Where: s = sample standard deviation, n = sample size, xᵢ = individual data points, x̄ = sample mean.
Notation Summary
σ denotes population SD; s denotes sample SD. Denominator differs: N for population, n-1 (Bessel’s correction) for sample.
Calculation Methods
Step-by-Step Calculation
- Compute mean (μ or x̄).
- Calculate deviations (xᵢ - mean).
- Square deviations.
- Sum squared deviations.
- Divide by N (population) or n-1 (sample).
- Take square root of result.
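The steps above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `standard_deviation`, its `sample` flag, and the data values are assumptions chosen for the example.

```python
import math

def standard_deviation(data, sample=False):
    """Return the SD of `data`: divide by n-1 if `sample` is True, else by N."""
    n = len(data)
    if n == 0 or (sample and n < 2):
        raise ValueError("need at least one value (two for a sample SD)")
    mean = sum(data) / n                          # step 1: mean
    deviations = [x - mean for x in data]         # step 2: deviations from the mean
    squared = [d ** 2 for d in deviations]        # step 3: squared deviations
    total = sum(squared)                          # step 4: sum of squares
    variance = total / (n - 1 if sample else n)   # step 5: divide by n-1 or N
    return math.sqrt(variance)                    # step 6: square root

values = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative data
print(standard_deviation(values))               # population SD
print(standard_deviation(values, sample=True))  # sample SD
```

For the same data the sample SD is always slightly larger than the population SD, because the sum of squares is divided by a smaller number.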
Computational Formulas
Variance (σ² or s²) = (Sum of squared deviations) / (N or n-1)
Standard deviation = √Variance
Using Software and Calculators
Commonly calculated via statistical software (R, SPSS, Excel). Functions: STDEV.P (population), STDEV.S (sample) in Excel.
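Python is not among the tools listed above, but its standard library draws the same population/sample distinction. A short sketch, assuming illustrative data: `statistics.pstdev` divides by N and `statistics.stdev` divides by n-1.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative data

population_sd = statistics.pstdev(data)  # divides by N, like Excel's STDEV.P
sample_sd = statistics.stdev(data)       # divides by n-1, like Excel's STDEV.S

print(f"population SD: {population_sd:.4f}")
print(f"sample SD:     {sample_sd:.4f}")
```

Defaults differ between tools, so check which denominator a function uses before reporting a value. NumPy's `std`, for example, divides by N unless `ddof=1` is passed.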
Interpretation
Magnitude Meaning
Higher SD: greater variability, data more spread out. Lower SD: data clustered near mean.
Relation to Data Distribution
In normal distributions, ~68% data within ±1 SD, ~95% within ±2 SD, ~99.7% within ±3 SD (empirical rule).
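The empirical-rule percentages can be checked from the normal distribution itself. For a normal variable, the probability of falling within k SDs of the mean is erf(k/√2). The helper name `normal_coverage` below is an assumption for illustration.

```python
import math

def normal_coverage(k):
    """Probability that a normal variable lies within k SDs of its mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within ±{k} SD: {normal_coverage(k):.2%}")
```

The results are approximately 68.27%, 95.45%, and 99.73%. These figures hold for normal distributions only; skewed or heavy-tailed data can deviate substantially.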
Contextual Importance
Interpret relative to scale of data. Absolute SD meaningful only alongside mean or range.
Relationship to Variance
Definition of Variance
Variance: average squared deviation from mean. Formula similar to SD but without square root.
Mathematical Link
Standard deviation = square root of variance. Units of variance: squared units; units of SD: original units.
Practical Implications
Variance useful in theoretical contexts; SD preferred for interpretability and reporting.
Population vs Sample Standard Deviation
Population SD
Calculates spread for entire population data. Uses N in denominator. True parameter if population data complete.
Sample SD
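Estimates population SD from subset. Uses n-1 (Bessel’s correction), which makes the sample variance s² an unbiased estimator of σ². The sample SD s itself still underestimates σ slightly on average, but less than with denominator n.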
When to Use Which
Population SD when full data available; sample SD for inferential statistics and estimation.
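The effect of the n-1 denominator can be seen by exhaustive enumeration. The sketch below assumes a hypothetical three-value population and samples of size 2 drawn with replacement. It averages the variance estimate over every possible sample, using exact fractions to avoid rounding.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3]   # hypothetical tiny population
n = 2                    # sample size

def variance(values, denominator):
    """Sum of squared deviations from the mean, divided by `denominator`."""
    mean = Fraction(sum(values), len(values))
    return sum((x - mean) ** 2 for x in values) / denominator

true_variance = variance(population, len(population))

# Every ordered sample of size n, drawn with replacement.
samples = list(product(population, repeat=n))
avg_n = sum(variance(s, n) for s in samples) / len(samples)
avg_n_minus_1 = sum(variance(s, n - 1) for s in samples) / len(samples)

print("population variance:     ", true_variance)
print("average, dividing by n:  ", avg_n)
print("average, dividing by n-1:", avg_n_minus_1)
```

Dividing by n underestimates the population variance on average, while dividing by n-1 recovers it exactly in this setting.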
Properties
Non-Negativity
SD always ≥ 0. Zero if all data points identical.
Effect of Linear Transformations
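SD scales with the absolute value of a multiplier and is unaffected by an added constant: SD(aX + b) = |a| × SD(X). Adding b shifts every value and the mean equally, so deviations from the mean do not change.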
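A quick numerical check of this property, assuming illustrative data and hypothetical constants a = -3 and b = 10:

```python
import statistics

x = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative data
a, b = -3, 10                   # hypothetical scale and shift

transformed = [a * v + b for v in x]

sd_x = statistics.pstdev(x)
sd_transformed = statistics.pstdev(transformed)

# The shift b has no effect; the scale a enters through its absolute value.
print(sd_x, sd_transformed, abs(a) * sd_x)
```

The negative sign of a flips the data around but does not make the SD negative, which is why the absolute value appears in the formula.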
Robustness
Not robust to outliers; sensitive to extreme values.
Applications
Descriptive Statistics
Summarizes data spread for research, reports, surveys.
Quality Control
Monitors process variability, identifies deviations from standards.
Finance
Measures volatility of asset returns, risk assessment.
Experimental Sciences
Quantifies measurement precision, variability in experiments.
Limitations
Sensitivity to Outliers
Outliers inflate SD, potentially misleading variability assessment.
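A small sketch of this effect, using hypothetical measurements to which one extreme value is appended:

```python
import statistics

readings = [10, 11, 9, 10, 12, 10, 11]   # hypothetical measurements
with_outlier = readings + [50]            # same data plus one extreme value

sd_clean = statistics.stdev(readings)
sd_outlier = statistics.stdev(with_outlier)

print(f"without outlier: {sd_clean:.2f}")
print(f"with outlier:    {sd_outlier:.2f}")
```

A single extreme value raises the SD by more than an order of magnitude here, because each deviation is squared before summing.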
Assumption of Interval Data
Meaningful only for interval or ratio scales; not for nominal or ordinal data.
Misinterpretation Risks
High SD not inherently undesirable; interpretation is context-dependent. Report alongside complementary statistics (mean, median, range).
Examples
Example 1: Population SD Calculation
Data: 2, 4, 4, 4, 5, 5, 7, 9
| Step | Value |
|---|---|
| Mean (μ) | 5 |
| Squared deviations sum | 32 |
| Variance (σ²) | 32 / 8 = 4 |
| Standard deviation (σ) | √4 = 2 |
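The sum of squared deviations in the table can be expanded term by term, grouping repeated values:

$$\sum_{i=1}^{8} (x_i - \mu)^2 = (2-5)^2 + 3(4-5)^2 + 2(5-5)^2 + (7-5)^2 + (9-5)^2 = 9 + 3 + 0 + 4 + 16 = 32$$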
Example 2: Sample SD Calculation
Data sample: 10, 12, 23, 23, 16, 23, 21, 16
| Step | Value |
|---|---|
| Mean (x̄) | 18 |
| Squared deviations sum | 192 |
| Variance (s²) | 192 / (8-1) ≈ 27.43 |
| Standard deviation (s) | √27.43 ≈ 5.24 |