Definition and Basic Properties
Formal Definition
Cumulative Distribution Function (CDF) of a random variable X: function F_X(x) = P(X ≤ x), mapping real numbers to [0,1]. Represents the total probability mass at or below x; for a continuous X, the area under the density to the left of x.
Domain and Range
Domain: x ∈ ℝ. Range: F_X(x) ∈ [0,1]. Non-decreasing function.
Interpretation
Measures likelihood that the variable takes a value less than or equal to x. Encapsulates entire distribution information.
$$
F_X(x) = P(X \le x) =
\begin{cases}
\sum_{t \le x} P(X = t), & \text{for discrete } X \\
\int_{-\infty}^{x} f_X(t)\,dt, & \text{for continuous } X
\end{cases}
$$
Monotonicity
Non-decreasing: if a ≤ b, then F_X(a) ≤ F_X(b). Reflects cumulative probability accumulation.
Limits at Infinity
lim_{x→-∞} F_X(x) = 0 and lim_{x→∞} F_X(x) = 1, often abbreviated F_X(-∞) = 0 and F_X(∞) = 1. Ensures total probability equals 1.
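As a minimal sketch of the discrete branch of the definition, the following Python computes F_X(x) by summing a PMF over all support points t ≤ x. The fair-die PMF and the helper name `discrete_cdf` are illustrative assumptions, not part of the text above.

```python
from fractions import Fraction

# Hypothetical example: PMF of a fair six-sided die, stored as {value: probability}.
die_pmf = {k: Fraction(1, 6) for k in range(1, 7)}

def discrete_cdf(pmf, x):
    """Return F_X(x) = sum of P(X = t) over all support points t <= x."""
    return sum(p for t, p in pmf.items() if t <= x)

# Below the support the CDF is 0; above it the CDF is 1.
print(discrete_cdf(die_pmf, 0))     # 0
print(discrete_cdf(die_pmf, 3.5))   # 1/2
print(discrete_cdf(die_pmf, 10))    # 1

# Monotonicity: evaluating on an increasing grid gives a non-decreasing sequence.
grid = [-1, 0, 1, 2.5, 4, 6, 7]
values = [discrete_cdf(die_pmf, x) for x in grid]
assert all(a <= b for a, b in zip(values, values[1:]))
```

Exact fractions are used here only to keep the sums free of rounding error; floats would work equally well for most purposes.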
Types of Cumulative Distribution Functions
Discrete CDFs
Defined by step functions. Jumps at points where P(X=x) > 0. Constant elsewhere.
Continuous CDFs
Continuous everywhere. When a density exists (the absolutely continuous case), the CDF is differentiable almost everywhere and its derivative equals the probability density function (PDF).
Mixed Distributions
Combination of discrete and continuous parts. CDF has jumps and continuous sections.
Examples
Bernoulli, Poisson (discrete). Normal, Exponential (continuous). Mixed: distributions with atoms and densities.
Tabular Example
| Random Variable | Type | CDF Characteristics |
|---|---|---|
| Poisson(λ) | Discrete | Step function with jumps at integers |
| Normal(μ, σ²) | Continuous | Smooth, strictly increasing, differentiable |
| Mixed Distribution | Mixed | Combination of jumps and smooth parts |
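To make the mixed row concrete, here is a sketch under an assumed model: X equals 0 with probability w (an atom) and otherwise follows an Exponential distribution. The model, the parameter names `w` and `rate`, and the function `mixed_cdf` are hypothetical choices for illustration.

```python
import math

def mixed_cdf(x, w=0.3, rate=2.0):
    """CDF of a mixture: point mass w at 0 plus (1 - w) * Exponential(rate)."""
    if x < 0:
        return 0.0
    # The jump of size w occurs at x = 0; the exponential part is smooth for x > 0.
    return w + (1.0 - w) * (1.0 - math.exp(-rate * x))

print(mixed_cdf(-1e-9))  # 0.0, just left of the atom
print(mixed_cdf(0.0))    # 0.3, the jump equals P(X = 0)
print(mixed_cdf(50.0))   # close to 1 far to the right
```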
Key Properties
Non-Decreasing Nature
F_X is monotone non-decreasing. Ensures meaningful cumulative probability interpretation.
Right-Continuity
F_X is right-continuous: lim_{h→0+} F_X(x+h) = F_X(x). Follows from the ≤ in the definition together with continuity of probability measures from above.
Bounds
0 ≤ F_X(x) ≤ 1 for all x ∈ ℝ.
Limits
lim_{x→-∞} F_X(x) = 0; lim_{x→∞} F_X(x) = 1.
Uniqueness
CDF uniquely determines the distribution of X. Two variables with identical CDFs have the same distribution.
Relationship with Probability Density and Mass Functions
Discrete Case: PMF
Probability Mass Function (PMF) p_X(x) relates to CDF by jumps:
$$
p_X(x) = F_X(x) - \lim_{t \to x^-} F_X(t)
$$
Continuous Case: PDF
Probability Density Function (PDF) f_X(x) is derivative of CDF where differentiable:
$$
f_X(x) = \frac{dF_X(x)}{dx}
$$
Mixed Case
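CDF has both jump discontinuities and continuously increasing parts. Away from the jump points, the derivative gives the density of the continuous part, which integrates to less than 1; the jumps carry the remaining mass.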
Integral Representation
For continuous X, the CDF is the integral of the PDF:
$$
F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt
$$
Summary Table
| Distribution Type | CDF Behavior | Derivative or Difference |
|---|---|---|
| Discrete | Step function | PMF = jump size |
| Continuous | Continuous, differentiable | PDF = derivative |
| Mixed | Jumps + continuous | Combination |
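The two relationships in the table can be checked numerically. The sketch below recovers a Poisson PMF value from a CDF jump and an Exponential PDF value from a finite-difference derivative; the parameter values and function names are illustrative assumptions.

```python
import math

def poisson_cdf(x, lam):
    """F(x) = sum_{k=0}^{floor(x)} e^{-lam} lam^k / k! for x >= 0, else 0."""
    if x < 0:
        return 0.0
    return sum(math.exp(-lam) * lam**k / math.factorial(k)
               for k in range(math.floor(x) + 1))

def exponential_cdf(x, rate):
    """F(x) = 1 - e^{-rate x} for x >= 0, else 0."""
    return 1.0 - math.exp(-rate * x) if x >= 0 else 0.0

lam, rate = 3.0, 1.5

# Discrete: for an integer-valued X, the left limit at k is F(k - 1),
# so the jump F(k) - F(k - 1) recovers the PMF at k.
k = 2
jump = poisson_cdf(k, lam) - poisson_cdf(k - 1, lam)
pmf = math.exp(-lam) * lam**k / math.factorial(k)
assert math.isclose(jump, pmf, rel_tol=1e-9)

# Continuous: a central difference approximates dF/dx, which should match the PDF.
x, h = 0.8, 1e-5
slope = (exponential_cdf(x + h, rate) - exponential_cdf(x - h, rate)) / (2 * h)
pdf = rate * math.exp(-rate * x)
assert math.isclose(slope, pdf, rel_tol=1e-6)
```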
Computation and Examples
Discrete Example: Bernoulli
Random variable X with P(X=1)=p, P(X=0)=1-p. CDF:
$$
F_X(x) =
\begin{cases}
0, & x < 0 \\
1 - p, & 0 \le x < 1 \\
1, & x \ge 1
\end{cases}
$$
Continuous Example: Uniform(0,1)
CDF is linear in support interval:
$$
F_X(x) =
\begin{cases}
0, & x < 0 \\
x, & 0 \le x \le 1 \\
1, & x > 1
\end{cases}
$$
Computational Methods
Numerical integration for continuous distributions without closed form. Summation for discrete.
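As one possible illustration of numerical integration, the sketch below approximates the standard normal CDF, which has no elementary closed form, with composite Simpson's rule and compares it with the error-function expression. The truncation point `lower=-10.0` and the step count are assumptions chosen for this example, not recommended settings.

```python
import math

def normal_pdf(t):
    """Standard normal density."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def normal_cdf_simpson(x, lower=-10.0, n=2000):
    """Approximate the integral of the density from `lower` to x with Simpson's rule.

    `lower` stands in for -infinity; the mass below -10 is negligible.
    n must be even.
    """
    if x <= lower:
        return 0.0
    h = (x - lower) / n
    total = normal_pdf(lower) + normal_pdf(x)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * normal_pdf(lower + i * h)
    return total * h / 3.0

def normal_cdf_erf(x):
    """Reference value via the error function: Phi(x) = (1 + erf(x / sqrt(2))) / 2."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

for x in (-1.0, 0.0, 1.96):
    print(x, normal_cdf_simpson(x), normal_cdf_erf(x))
```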
Software Implementations
Functions available in R (pnorm, pbinom), Python (the cdf method of scipy.stats distributions, e.g., scipy.stats.norm.cdf), MATLAB (cdf).
Example Plot Description
Plotting CDF shows step jumps for discrete, smooth curves for continuous distributions.
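In place of a plot, the Bernoulli and Uniform(0,1) CDFs from the examples can be tabulated on a grid to show the step versus ramp behavior. This is a small sketch; the grid and the value p = 0.3 are arbitrary assumptions.

```python
def bernoulli_cdf(x, p):
    """0 for x < 0, 1 - p for 0 <= x < 1, 1 for x >= 1."""
    if x < 0:
        return 0.0
    if x < 1:
        return 1.0 - p
    return 1.0

def uniform01_cdf(x):
    """0 for x < 0, x on [0, 1], 1 for x > 1."""
    return min(max(x, 0.0), 1.0)

# Tabulate both on a common grid; p = 0.3 is an arbitrary illustrative value.
for x in (-0.5, 0.0, 0.25, 0.5, 0.99, 1.0, 1.5):
    print(f"{x:5.2f}  {bernoulli_cdf(x, 0.3):.2f}  {uniform01_cdf(x):.2f}")
```

The Bernoulli column stays flat between 0 and 1 and jumps at both points, while the Uniform column rises linearly across the same interval.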
Applications in Probability and Statistics
Probability Computation
Calculate probabilities for intervals: P(a < X ≤ b) = F_X(b) - F_X(a).
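A short sketch of the interval formula, using `statistics.NormalDist` from the Python standard library. The distribution Normal(100, 15²) and the endpoints are hypothetical values chosen for illustration.

```python
from statistics import NormalDist

# Hypothetical example: X ~ Normal(mean 100, standard deviation 15).
X = NormalDist(mu=100, sigma=15)

a, b = 85, 115
prob = X.cdf(b) - X.cdf(a)   # P(a < X <= b) = F_X(b) - F_X(a)
print(round(prob, 4))        # about 0.6827, the one-sigma interval
```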
Statistical Inference
Used in hypothesis testing, confidence interval construction.
Simulation
Inverse transform method uses inverse CDF to generate random samples.
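A minimal sketch of the inverse transform method for an Exponential distribution, whose CDF F(x) = 1 - e^{-rate·x} inverts in closed form. The rate, seed, and sample size are assumptions for this example.

```python
import math
import random

def sample_exponential(rate, rng):
    """Inverse transform: solve u = 1 - exp(-rate * x) for x."""
    u = rng.random()                  # u is uniform on [0, 1)
    return -math.log(1.0 - u) / rate  # 1 - u lies in (0, 1], so the log is finite

rng = random.Random(0)                # fixed seed so the run is repeatable
rate = 2.0
draws = [sample_exponential(rate, rng) for _ in range(100_000)]

# The sample mean should be near the theoretical mean 1 / rate = 0.5.
print(sum(draws) / len(draws))
```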
Risk Assessment
Modeling cumulative probabilities of losses or events up to thresholds.
Reliability Engineering
Failure time distributions analyzed via CDFs.
Continuity and Discontinuity
Right-Continuity Defined
F_X is continuous from the right at every x: lim_{h→0+} F_X(x + h) = F_X(x).
Points of Discontinuity
Discrete distributions have jump discontinuities at points with positive mass.
Implications
Jumps correspond to probabilities of exact values in discrete variables.
Continuous Distributions
No jumps; CDF continuous everywhere.
Mixed Cases
Combine continuous intervals and jump points.
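The difference between right-continuity and a jump from the left can be probed numerically. The sketch below evaluates a Bernoulli CDF just to the left and right of its two atoms; the step h and the value p = 0.3 are illustrative assumptions, and a small finite h only approximates the limits.

```python
def bernoulli_cdf(x, p=0.3):
    """CDF of a Bernoulli(p) variable; p = 0.3 is an arbitrary illustrative value."""
    if x < 0:
        return 0.0
    return 1.0 - p if x < 1 else 1.0

h = 1e-9
for x0 in (0.0, 1.0):
    from_right = bernoulli_cdf(x0 + h)   # approximates the right limit
    from_left = bernoulli_cdf(x0 - h)    # approximates the left limit
    value = bernoulli_cdf(x0)
    # Right limit equals the value; value minus left limit is P(X = x0).
    print(x0, from_right == value, value - from_left)
```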
Inverse CDF and Quantile Function
Definition
Inverse CDF or quantile function Q(p) = inf{x: F_X(x) ≥ p} for p ∈ (0,1). Reduces to the ordinary inverse F_X^{-1}(p) when F_X is continuous and strictly increasing.
Usage
Generates random variables from uniform samples (inverse transform sampling).
Properties
Non-decreasing, left-continuous. Exists for every distribution, even when F_X is not invertible in the ordinary sense.
Examples
Uniform(0,1): Q(p) = p; Normal distribution quantiles via numerical methods.
Computational Algorithms
Numerical inversion methods: bisection, Newton-Raphson.
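Bisection can be sketched directly from the definition Q(p) = inf{x: F_X(x) ≥ p}. The helper `quantile_bisect`, the bracketing interval, and the tolerance below are assumptions for illustration; Newton-Raphson is not shown.

```python
from statistics import NormalDist

def quantile_bisect(cdf, p, lo, hi, tol=1e-10):
    """Approximate Q(p) = inf{x : cdf(x) >= p} on [lo, hi] by bisection.

    Assumes cdf is non-decreasing with cdf(lo) < p <= cdf(hi).
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) >= p:
            hi = mid      # mid satisfies F(mid) >= p, so the infimum is at or below mid
        else:
            lo = mid      # F(mid) < p, so the infimum is above mid
    return hi

std_normal = NormalDist()
q = quantile_bisect(std_normal.cdf, 0.975, -10.0, 10.0)
print(q, std_normal.inv_cdf(0.975))   # both close to 1.96
```

Because the loop only relies on monotonicity, the same routine also handles step-function CDFs, where it converges to the jump point at which the CDF first reaches p.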
Multivariate Cumulative Distribution Functions
Definition
CDF of vector X = (X_1, ..., X_n): F_X(x_1, ..., x_n) = P(X_1 ≤ x_1, ..., X_n ≤ x_n).
Properties
Non-decreasing in each argument and right-continuous. F_X → 0 if any x_i → -∞; F_X → 1 when all x_i → ∞. Every rectangle must also receive non-negative probability.
Marginal Distributions
Obtained by letting the other arguments tend to +∞; e.g., F_{X_1}(x_1) = lim_{x_2→∞, ..., x_n→∞} F_X(x_1, x_2, ..., x_n).
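A two-dimensional sketch of these ideas, under the simplifying assumption that the components are independent so the joint CDF factorizes. The marginals and the names `joint_cdf` and `rectangle_prob` are hypothetical choices for illustration.

```python
from statistics import NormalDist

# Assumed model: X1 ~ Normal(0, 1) and X2 ~ Normal(1, 2), independent,
# so the joint CDF factorizes as F(x1, x2) = F1(x1) * F2(x2).
F1 = NormalDist(0.0, 1.0).cdf
F2 = NormalDist(1.0, 2.0).cdf

def joint_cdf(x1, x2):
    """P(X1 <= x1, X2 <= x2) under the independence assumption above."""
    return F1(x1) * F2(x2)

# Marginal of X1: let x2 grow large so that F2(x2) approaches 1.
print(joint_cdf(0.5, 1e6), F1(0.5))

def rectangle_prob(a1, b1, a2, b2):
    """P(a1 < X1 <= b1, a2 < X2 <= b2) by inclusion-exclusion on the joint CDF."""
    return (joint_cdf(b1, b2) - joint_cdf(a1, b2)
            - joint_cdf(b1, a2) + joint_cdf(a1, a2))

print(rectangle_prob(-1.0, 1.0, 0.0, 2.0))
```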
Copulas
Functions that join univariate marginals to form multivariate CDFs. Capture dependence structure.
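As an illustration, the sketch below joins two standard normal marginals with a Clayton copula, one common parametric family. The choice of family, the parameter theta = 2, and the marginals are assumptions for this example only.

```python
from statistics import NormalDist

def clayton_copula(u, v, theta=2.0):
    """Clayton copula C(u, v) = (u^-theta + v^-theta - 1)^(-1/theta), theta > 0."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)

# Assumed marginals: both standard normal.
F1 = NormalDist().cdf
F2 = NormalDist().cdf

def joint_cdf(x1, x2, theta=2.0):
    """Joint CDF built as C(F1(x1), F2(x2))."""
    return clayton_copula(F1(x1), F2(x2), theta)

# Setting one argument of the copula to 1 returns the other marginal probability.
print(clayton_copula(0.4, 1.0), joint_cdf(0.0, 0.0))
```

Under independence the joint CDF at (0, 0) would be 0.25; the larger value produced here reflects the positive dependence encoded by the copula.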
Applications
Modeling joint risks, multivariate statistical inference, dependence analysis.
Estimation from Data
Empirical CDF
Defined as F_n(x) = (1/n) ∑_{i=1}^n I(X_i ≤ x) for a sample X_1, ..., X_n, where I is the indicator function. Step function with a jump of 1/n at each data point (larger at tied values).
Properties
Non-decreasing, right-continuous. For i.i.d. samples, converges uniformly to the true CDF almost surely (Glivenko-Cantelli theorem).
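The empirical CDF can be sketched in a few lines by sorting the sample once and counting observations at or below x with a binary search. The helper name `make_ecdf` and the data values are hypothetical.

```python
from bisect import bisect_right

def make_ecdf(sample):
    """Return F_n with F_n(x) = (number of observations <= x) / n."""
    data = sorted(sample)
    n = len(data)
    def ecdf(x):
        # bisect_right counts the entries <= x in the sorted data.
        return bisect_right(data, x) / n
    return ecdf

# Hypothetical observations, including a tie at 2.0.
F_n = make_ecdf([3.1, 2.0, 5.4, 2.0, 4.2])
print(F_n(1.9), F_n(2.0), F_n(4.2), F_n(6.0))   # 0.0 0.4 0.8 1.0
```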
Smoothing Techniques
Kernel smoothing applied to empirical CDF to estimate continuous distributions.
Confidence Bands
Dvoretzky–Kiefer–Wolfowitz inequality bounds the estimation error: P(sup_x |F_n(x) - F(x)| > ε) ≤ 2 exp(-2nε²) for every ε > 0, which yields uniform confidence bands.
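A sketch of a uniform confidence band built from the bound P(sup_x |F_n(x) - F(x)| > ε) ≤ 2 exp(-2nε²): setting the right-hand side to α and solving gives ε = sqrt(ln(2/α) / (2n)). The simulated Uniform(0,1) sample, the seed, and the function names are assumptions for illustration.

```python
import math
import random
from bisect import bisect_right

def dkw_epsilon(n, alpha=0.05):
    """Half-width eps with P(sup |F_n - F| > eps) <= alpha, from 2 exp(-2 n eps^2) = alpha."""
    return math.sqrt(math.log(2.0 / alpha) / (2.0 * n))

rng = random.Random(1)                               # fixed seed for repeatability
sample = sorted(rng.random() for _ in range(500))    # Uniform(0,1) draws, true F(x) = x
n = len(sample)
eps = dkw_epsilon(n)

def band(x):
    """Lower and upper confidence limits for F(x), clipped to [0, 1]."""
    f_n = bisect_right(sample, x) / n
    return max(f_n - eps, 0.0), min(f_n + eps, 1.0)

print(eps)          # about 0.0607 for n = 500, alpha = 0.05
print(band(0.5))    # an interval that should usually contain the true value 0.5
```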
Practical Considerations
Sample size, data quality impact accuracy of empirical CDF.
Limitations and Considerations
Non-uniqueness of PDF
Different PDFs can yield the same CDF if they differ only on a set of measure zero; the CDF itself still determines the distribution uniquely.
Discontinuities in Mixed Distributions
Complicate differentiation and inversion procedures.
Numerical Challenges
Computing CDFs without closed form requires approximation, may induce errors.
Multivariate Complexity
Higher dimensions increase computational and interpretive difficulty.
Interpretation Nuances
Requires understanding of underlying random variable type (discrete, continuous, mixed).
Advanced Topics and Extensions
Generalized CDFs
Extensions to random elements in abstract spaces, distribution functions on metric spaces.
Stochastic Processes
CDFs for processes describe joint distributions over time or indices.
Conditional CDFs
Define distributions conditional on events or sigma-algebras. Basis for Bayesian inference.
Order Statistics
CDFs of sorted samples: describe distribution of minima, maxima, medians.
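For n i.i.d. variables with common CDF F, the maximum has CDF F(x)^n and the minimum has CDF 1 - (1 - F(x))^n. The sketch below checks both against a simulation, assuming Uniform(0,1) draws; the sample size, seed, and evaluation point are illustrative.

```python
import random

def cdf_max(F, n):
    """CDF of the maximum of n i.i.d. variables with common CDF F: F(x)^n."""
    return lambda x: F(x) ** n

def cdf_min(F, n):
    """CDF of the minimum of n i.i.d. variables: 1 - (1 - F(x))^n."""
    return lambda x: 1.0 - (1.0 - F(x)) ** n

def uniform_cdf(x):
    """Uniform(0,1) CDF, used here as the example distribution."""
    return min(max(x, 0.0), 1.0)

n = 5
F_max, F_min = cdf_max(uniform_cdf, n), cdf_min(uniform_cdf, n)

# Monte Carlo check at x = 0.7 with a fixed seed.
rng = random.Random(42)
trials = 50_000
hits_max = hits_min = 0
for _ in range(trials):
    draws = [rng.random() for _ in range(n)]
    hits_max += max(draws) <= 0.7
    hits_min += min(draws) <= 0.7
print(F_max(0.7), hits_max / trials)
print(F_min(0.7), hits_min / trials)
```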
Extreme Value Theory
Study of limiting distributions for maxima/minima, involves specific CDF forms.