Definition and Properties

Overview

Normal distribution: continuous, symmetric, unimodal. Shape: bell curve. Defined by mean (μ) and variance (σ²). Support: all real numbers (-∞, ∞). Model: natural variation, measurement errors, biological traits.

Mathematical Definition

Random variable X is normal, written X ~ N(μ, σ²), if it has the Gaussian pdf with parameters μ ∈ ℝ and σ > 0. Properties: symmetry about mean, mode = median = mean, infinite support, tails asymptotically approach zero.

Key Properties

Unimodality: single peak at mean μ. Symmetry: f(μ - x) = f(μ + x). Moments: all moments finite. Characteristic: max entropy for given mean and variance.

Probability Density Function and CDF

Probability Density Function (PDF)

PDF formula:

f(x) = (1 / (σ √(2π))) * exp(-(x - μ)² / (2σ²))

Interpretation: height of curve at x; area under curve between points = probability.
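
A minimal sketch of the PDF formula in Python, using only the standard library. The function name normal_pdf and its default arguments are illustrative choices, not part of the source.

```python
import math

def normal_pdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Density of N(mu, sigma^2) at x; sigma is the standard deviation."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Height of the standard normal curve at its peak: 1 / sqrt(2*pi), about 0.3989
peak = normal_pdf(0.0)
```

The value returned is a density, not a probability; probabilities come from integrating it over an interval.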

Cumulative Distribution Function (CDF)

CDF: F(x) = P(X ≤ x). No closed form in elementary functions; expressed via the error function erf:

F(x) = 0.5 * [1 + erf((x - μ) / (σ √2))]

Properties of PDF and CDF

PDF integrates to 1. CDF non-decreasing, continuous, limits 0 at -∞, 1 at ∞. Symmetry: F(μ + a) = 1 - F(μ - a).
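
The erf expression for the CDF and the symmetry identity can be checked numerically. This is a sketch assuming Python's math.erf; the name normal_cdf and the sample values μ = 70, σ = 5, a = 3 are illustrative.

```python
import math

def normal_cdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """F(x) = 0.5 * [1 + erf((x - mu) / (sigma * sqrt(2)))]."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

# Symmetry check: F(mu + a) should equal 1 - F(mu - a)
mu, sigma, a = 70.0, 5.0, 3.0
upper = normal_cdf(mu + a, mu, sigma)
lower = normal_cdf(mu - a, mu, sigma)
symmetry_gap = abs(upper - (1.0 - lower))  # expected to be at rounding-error level
```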

| Property | Description |
|---|---|
| Support | (-∞, ∞) |
| Symmetry | About mean μ |
| Mean, Median, Mode | All equal to μ |
| Variance | σ² |

Parameters: Mean and Variance

Mean (μ)

Location parameter. Center of distribution. Determines peak position. Estimate: sample average.

Variance (σ²)

Spread parameter. Measures dispersion; its square root σ (standard deviation) is the scale parameter and controls the width of the bell curve. Estimate: sample variance.

Higher Moments

Skewness = 0 (symmetry). Kurtosis = 3 (mesokurtic). Moments beyond variance describe shape deviations; normal is baseline.
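
As a rough illustration, sample skewness and kurtosis computed from simulated normal data should land near 0 and 3. The sketch below uses the simple moment-based estimators (divisor n); the function name, seed, and sample size are arbitrary choices.

```python
import random

def sample_skewness_kurtosis(data):
    """Moment-based skewness m3 / m2^1.5 and kurtosis m4 / m2^2 (divisor n)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2

rng = random.Random(0)  # fixed seed so the run is repeatable
sample = [rng.gauss(0.0, 1.0) for _ in range(100_000)]
skew, kurt = sample_skewness_kurtosis(sample)  # expected near 0 and near 3
```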

Standard Normal Distribution

Definition

Mean μ = 0, variance σ² = 1. Denoted Z ~ N(0,1). Basis for normalization and reference tables.

Standard Normal PDF and CDF

PDF:

ϕ(z) = (1 / √(2π)) * exp(-z² / 2)

CDF:

Φ(z) = ∫ from -∞ to z of ϕ(t) dt

Use in Statistical Tables

Tabulated Φ(z) values enable probability calculations. Software functions implement Φ and inverse Φ for quantiles.
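
One example of such software functions: Python's standard library (3.8 and later) provides statistics.NormalDist, whose cdf and inv_cdf methods play the roles of Φ and inverse Φ. A short sketch:

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

p = std_normal.cdf(1.96)        # Phi(1.96), about 0.975
z = std_normal.inv_cdf(0.975)   # inverse Phi at 0.975, about 1.96
```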

Z-scores and Standardization

Definition

Z-score: number of standard deviations a value is from mean.

z = (x - μ) / σ

Purpose

Normalize data to standard normal. Compare values across different scales. Facilitate probability and hypothesis testing.

Example

Value x=80, μ=70, σ=5. z = (80-70)/5 = 2. Interpretation: x is 2 SD above mean.
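
The same example in code. The tail probability assumes X itself is normally distributed, which the example above does not state; z_score is an illustrative helper name.

```python
from statistics import NormalDist

def z_score(x: float, mu: float, sigma: float) -> float:
    """Number of standard deviations x lies from the mean."""
    return (x - mu) / sigma

z = z_score(80, 70, 5)                    # 2.0
upper_tail = 1.0 - NormalDist().cdf(z)    # P(X > 80) if X is normal, about 0.0228
```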

Central Limit Theorem

Statement

Standardized sum (or average) of a large number of i.i.d. variables with finite variance converges in distribution to the standard normal, regardless of the original distribution: √n (X̄ₙ - μ) / σ → N(0, 1) as n → ∞.

Conditions

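Variables independent, identically distributed, with finite mean and finite nonzero variance. Sample size sufficiently large; n ≥ 30 is a common rule of thumb, but heavily skewed or heavy-tailed distributions may need far more.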
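
A small simulation can make the theorem concrete. The sketch below assumes Exponential(1) draws (mean 1, variance 1, strongly right-skewed), n = 30, and 20,000 repetitions; all of these are illustrative choices. It standardizes each sample mean and measures how often it falls within ±1.96, which would be about 0.95 for an exact standard normal.

```python
import math
import random

rng = random.Random(42)   # fixed seed for repeatability
n, trials = 30, 20_000

def standardized_mean() -> float:
    """sqrt(n) * (sample mean - mu) / sigma for Exponential(1), where mu = sigma = 1."""
    xs = [rng.expovariate(1.0) for _ in range(n)]
    return (sum(xs) / n - 1.0) * math.sqrt(n)

zs = [standardized_mean() for _ in range(trials)]

# Fraction of standardized means inside the central 95% band of N(0, 1)
coverage = sum(abs(z) <= 1.96 for z in zs) / trials
```

The result should be close to, but not exactly, 0.95; the gap reflects both simulation noise and the residual skewness at n = 30.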

Implications

Justifies normal models in statistics. Basis for confidence intervals, hypothesis testing, parametric inference.

Moment Generating Functions

Definition

MGF M(t) = E[e^(tX)]. For normal:

M(t) = exp(μt + (σ² t²)/2)

Properties

MGF exists for all real t. Moments found by derivatives at t=0. Uniqueness: MGF determines distribution uniquely.
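
To illustrate "moments found by derivatives at t=0", the sketch below differentiates the normal MGF numerically with central differences. The parameter values μ = 1.5, σ = 2 and the step size h are arbitrary; an exact symbolic derivative would give M'(0) = μ and M''(0) = μ² + σ².

```python
import math

def normal_mgf(t: float, mu: float, sigma: float) -> float:
    """M(t) = exp(mu*t + sigma^2 * t^2 / 2)."""
    return math.exp(mu * t + 0.5 * sigma ** 2 * t ** 2)

mu, sigma, h = 1.5, 2.0, 1e-3

# Central-difference approximations to M'(0) and M''(0)
first_moment = (normal_mgf(h, mu, sigma) - normal_mgf(-h, mu, sigma)) / (2.0 * h)
second_moment = (
    normal_mgf(h, mu, sigma) - 2.0 * normal_mgf(0.0, mu, sigma) + normal_mgf(-h, mu, sigma)
) / h ** 2

variance = second_moment - first_moment ** 2  # approximately sigma^2 = 4
```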

Characteristic Function

φ(t) = E[e^(itX)] = exp(iμt - (σ² t²)/2). Useful in theory and proofs.

Applications in Statistics and Science

Statistical Modeling

Model measurement errors, test statistics, regression residuals. Foundation of parametric statistical methods.

Natural Sciences

Model height, IQ scores, blood pressure, noise. Describes phenomena with aggregate independent effects.

Engineering and Finance

Signal processing noise, risk analysis, asset returns approximation. Provides tractable analytical tools.

Sampling and Estimation

Sampling Distribution

Sample mean of n i.i.d. N(μ, σ²) observations is exactly normal: X̄ ~ N(μ, σ²/n). Variance of the mean shrinks as sample size grows.

Confidence Intervals

Use normal quantiles to construct interval estimates of μ when σ known or large samples.
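
A sketch of the known-σ interval, x̄ ± z·σ/√n, where z is the normal quantile at 0.5 + level/2. The function name, the data values, and σ = 5 are hypothetical and serve only to show the arithmetic.

```python
import math
from statistics import NormalDist, fmean

def z_confidence_interval(data, sigma: float, level: float = 0.95):
    """Two-sided interval for mu when the population sigma is treated as known."""
    n = len(data)
    mean = fmean(data)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)   # about 1.96 for level = 0.95
    half_width = z * sigma / math.sqrt(n)
    return mean - half_width, mean + half_width

data = [68.0, 72.5, 71.0, 69.5, 74.0, 70.0, 66.5, 73.5]  # hypothetical measurements
low, high = z_confidence_interval(data, sigma=5.0)        # roughly (67.16, 74.09)
```

With σ unknown and a small sample, a t quantile would normally replace the normal quantile.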

Parameter Estimation

Maximum likelihood estimators for μ and σ²: sample mean and the sample variance with divisor n (not n-1).

| Estimator | Formula | Properties |
|---|---|---|
| Mean (μ̂) | (1/n) ∑ xᵢ | Unbiased, consistent |
| Variance (σ̂²) | (1/n) ∑ (xᵢ - μ̂)² | Biased, consistent; unbiased: divide by (n-1) |
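
The two variance divisors can be compared directly. This sketch uses a small hypothetical data set and cross-checks the hand-written formulas against statistics.pvariance (divisor n) and statistics.variance (divisor n-1) from Python's standard library.

```python
from statistics import fmean, pvariance, variance

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical sample
n = len(data)

mu_hat = fmean(data)                               # 5.0
sum_sq = sum((x - mu_hat) ** 2 for x in data)      # 32.0

sigma2_mle = sum_sq / n            # 4.0, maximum likelihood (biased)
s2_unbiased = sum_sq / (n - 1)     # 32/7, about 4.571 (unbiased)

# Library equivalents of the two estimators
lib_mle = pvariance(data)
lib_unbiased = variance(data)
```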

Multivariate Normal Distribution

Definition

Vector X = (X₁, ..., Xₖ) is jointly normal if every linear combination a₁X₁ + ... + aₖXₖ is normal.

Parameters

Mean vector μ (k×1), covariance matrix Σ (k×k, symmetric positive semidefinite; positive definite when a density exists).

PDF Formula

f(x) = (1 / ((2π)^(k/2) |Σ|^(1/2))) * exp(-0.5 (x - μ)' Σ⁻¹ (x - μ))
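
For k = 2 the formula can be written out without a linear algebra library, since a 2×2 matrix has a simple determinant and inverse. The sketch below assumes Σ is symmetric positive definite; the function name is illustrative, and for general k a numerical library would be the usual choice.

```python
import math

def bivariate_normal_pdf(x, mu, cov) -> float:
    """Density of a 2-dimensional normal; cov is [[a, b], [c, d]], assumed symmetric PD."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    dx0, dx1 = x[0] - mu[0], x[1] - mu[1]
    # Quadratic form (x - mu)' inv(cov) (x - mu), with inv(cov) = [[d, -b], [-c, a]] / det
    quad = (d * dx0 * dx0 - (b + c) * dx0 * dx1 + a * dx1 * dx1) / det
    # For k = 2, (2*pi)^(k/2) = 2*pi and |cov|^(1/2) = sqrt(det)
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))

# Identity covariance, evaluated at the mean: 1 / (2*pi), about 0.1592
center = bivariate_normal_pdf((0.0, 0.0), (0.0, 0.0), [[1.0, 0.0], [0.0, 1.0]])
```

With a diagonal Σ the result factors into the product of two univariate normal densities, which gives a convenient sanity check.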

Applications

Multivariate modeling, principal component analysis, Bayesian inference, pattern recognition.

Limitations and Assumptions

Assumption of Normality

Many methods require normality; real data may deviate (skewness, kurtosis). Check with tests, plots.

Sensitivity to Outliers

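Normal tails are thin, so extreme values are assigned very low probability; estimates based on the normal model (sample mean, sample variance) are sensitive to outliers. Robust alternatives sometimes preferred.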

Non-Negative Data

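Support includes negative values, so the normal is a poor fit for strictly positive or heavily skewed data (e.g., waiting times) unless the mean lies many standard deviations above zero; use log-normal or gamma instead.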

Computation and Numerical Techniques

Evaluating CDF

No elementary closed form; use numerical integration, error function approximation, polynomial expansions.
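
As one example of the numerical integration route, the sketch below evaluates Φ(z) = 0.5 + ∫ from 0 to z of ϕ(t) dt with composite Simpson's rule and compares it with the erf form. The function names and the panel count n = 200 are arbitrary choices; production libraries typically use specialized approximations instead.

```python
import math

def phi(t: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def std_normal_cdf_simpson(z: float, n: int = 200) -> float:
    """Phi(z) via composite Simpson's rule on [0, z]; n must be even."""
    if n % 2:
        raise ValueError("n must be even")
    h = z / n                      # negative when z < 0, which flips the integral's sign
    total = phi(0.0) + phi(z)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * phi(i * h)
    return 0.5 + total * h / 3.0

def std_normal_cdf_erf(z: float) -> float:
    """Reference value from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

max_gap = max(
    abs(std_normal_cdf_simpson(z) - std_normal_cdf_erf(z))
    for z in (-3.0, -1.0, 0.0, 0.5, 1.96, 3.0)
)
```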

Inverse CDF (Quantile Function)

Essential for simulations, hypothesis testing. Computed via rational approximations or iterative methods.
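
A simple iterative method is bisection on the CDF, sketched below. It is slow compared with rational approximations but easy to verify. The bracket [-10, 10] and tolerance are assumptions that suit non-extreme p; probabilities very close to 0 or 1 would need a different approach.

```python
import math

def std_normal_cdf(z: float) -> float:
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def std_normal_quantile(p: float, lo: float = -10.0, hi: float = 10.0,
                        tol: float = 1e-12) -> float:
    """Invert Phi by bisection: find z with Phi(z) = p, for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if std_normal_cdf(mid) < p:
            lo = mid    # quantile lies to the right of mid
        else:
            hi = mid    # quantile lies at or to the left of mid
    return (lo + hi) / 2.0

z_975 = std_normal_quantile(0.975)   # about 1.96
```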

Random Number Generation

Methods: Box-Muller transform, Marsaglia polar method, ziggurat algorithm. Generate standard normal variates efficiently.
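
A minimal sketch of the basic (trigonometric) Box-Muller transform: two independent uniforms become two independent standard normal variates. The seed and sample size are arbitrary, and the function name is illustrative.

```python
import math
import random

def box_muller(rng: random.Random):
    """Return a pair of independent N(0, 1) variates from two uniforms."""
    u1 = 1.0 - rng.random()   # shift [0, 1) to (0, 1] so log(u1) is defined
    u2 = rng.random()
    r = math.sqrt(-2.0 * math.log(u1))
    theta = 2.0 * math.pi * u2
    return r * math.cos(theta), r * math.sin(theta)

rng = random.Random(123)
samples = []
for _ in range(50_000):
    samples.extend(box_muller(rng))

# Sample mean and variance should be near 0 and 1
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
```

To obtain N(μ, σ²) variates, scale and shift each output: x = μ + σz.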
