Cumulative Distribution - probability

Definition and Basic Properties

Formal Definition

Cumulative Distribution Function (CDF) of a random variable X: function F_X(x) = P(X ≤ x), mapping real numbers to [0,1]. Represents total probability mass or area up to point x.

Domain and Range

Domain: x ∈ ℝ. Range: F_X(x) ∈ [0,1]. Non-decreasing function.

Interpretation

Measures likelihood that the variable takes a value less than or equal to x. Encapsulates entire distribution information.

F_X(x) = P(X ≤ x) =
  ∑_{t ≤ x} P(X = t),      if X is discrete
  ∫_{-∞}^x f_X(t) dt,      if X is continuous

Monotonicity

Non-decreasing: if a ≤ b, then F_X(a) ≤ F_X(b). Reflects cumulative probability accumulation.

Limits at Infinity

F_X(-∞) = 0 and F_X(∞) = 1, as shorthand for the limits of F_X(x) as x → -∞ and x → ∞. The second limit ensures total probability equals 1.

Types of Cumulative Distribution Functions

Discrete CDFs

Defined by step functions. Jumps at points where P(X=x) > 0. Constant elsewhere.

Continuous CDFs

Continuous everywhere, with no jumps. When a density exists, the CDF is differentiable almost everywhere and its derivative equals the probability density function (PDF).

Mixed Distributions

Combination of discrete and continuous parts. CDF has jumps and continuous sections.

Examples

Bernoulli, Poisson (discrete). Normal, Exponential (continuous). Mixed: distributions with atoms and densities.

Tabular Example

Random Variable | Type | CDF Characteristics
Poisson(λ) | Discrete | Step function with jumps at the non-negative integers
Normal(μ, σ²) | Continuous | Smooth, strictly increasing, differentiable
Mixed Distribution | Mixed | Combination of jumps and smooth parts

Key Properties

Non-Decreasing Nature

F_X is monotone non-decreasing. Ensures meaningful cumulative probability interpretation.

Right-Continuity

F_X is right-continuous: lim_{h→0+} F_X(x+h) = F_X(x). Follows from the ≤ in the definition: the events {X ≤ x+h} shrink to {X ≤ x} as h → 0+.

Bounds

0 ≤ F_X(x) ≤ 1 for all x ∈ ℝ.

Limits

lim_{x→-∞} F_X(x) = 0; lim_{x→∞} F_X(x) = 1.

Uniqueness

CDF uniquely determines the distribution of X. Two variables with identical CDFs have the same distribution.

Relationship with Probability Density and Mass Functions

Discrete Case: PMF

Probability Mass Function (PMF) p_X(x) relates to CDF by jumps:

p_X(x) = F_X(x) - lim_{t→x^-} F_X(t)
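
For an integer-valued variable the left limit at k equals F_X(k - 1), so the jump formula reduces to a difference of consecutive CDF values. A minimal sketch, assuming a Poisson(λ) example; the helper names `poisson_cdf` and `pmf_from_cdf` are illustrative and not taken from any library:

```python
import math

def poisson_cdf(k: int, lam: float) -> float:
    """CDF of Poisson(lam) at integer k: sum of the PMF over 0..k."""
    if k < 0:
        return 0.0
    return sum(math.exp(-lam) * lam**j / math.factorial(j) for j in range(k + 1))

def pmf_from_cdf(k: int, lam: float) -> float:
    """Recover P(X = k) as the jump of the CDF at k: F(k) - F(k - 1)."""
    return poisson_cdf(k, lam) - poisson_cdf(k - 1, lam)

lam = 2.0
for k in range(4):
    direct = math.exp(-lam) * lam**k / math.factorial(k)
    # The two columns should agree up to floating-point rounding.
    print(k, round(pmf_from_cdf(k, lam), 6), round(direct, 6))
```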

Continuous Case: PDF

Probability Density Function (PDF) f_X(x) is derivative of CDF where differentiable:

f_X(x) = dF_X(x) / dx
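
The derivative relation can be checked numerically. A small sketch, assuming an Exponential distribution with an arbitrarily chosen rate of 1.5; `numerical_pdf` is a hypothetical helper that applies a central difference to any CDF passed in:

```python
import math

def exp_cdf(x: float, rate: float = 1.5) -> float:
    """CDF of Exponential(rate): 1 - exp(-rate * x) for x >= 0, else 0."""
    return 1.0 - math.exp(-rate * x) if x >= 0 else 0.0

def exp_pdf(x: float, rate: float = 1.5) -> float:
    """PDF of Exponential(rate): rate * exp(-rate * x) for x >= 0, else 0."""
    return rate * math.exp(-rate * x) if x >= 0 else 0.0

def numerical_pdf(cdf, x: float, h: float = 1e-5) -> float:
    """Central-difference approximation of dF/dx at x."""
    return (cdf(x + h) - cdf(x - h)) / (2.0 * h)

for x in (0.5, 1.0, 2.0):
    # Numerical derivative of the CDF next to the closed-form density.
    print(x, numerical_pdf(exp_cdf, x), exp_pdf(x))
```

The approximation is only meaningful where the CDF is differentiable; at x = 0 the Exponential CDF has a kink and the central difference does not match the density.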

Mixed Case

CDF has both jump discontinuities and continuous parts. A density describes only the continuous part; the jumps carry point masses, which are described by a PMF.

Integral Representation

CDF constructed as integral of PDF:

F_X(x) = ∫_{-∞}^x f_X(t) dt

Summary Table

Distribution Type | CDF Behavior | Derivative or Difference
Discrete | Step function | PMF = jump size
Continuous | Continuous, differentiable | PDF = derivative
Mixed | Jumps + continuous | Combination

Computation and Examples

Discrete Example: Bernoulli

Random variable X with P(X=1)=p, P(X=0)=1-p. CDF:

F_X(x) =
  0,        x < 0
  1 - p,    0 ≤ x < 1
  1,        x ≥ 1

Continuous Example: Uniform(0,1)

CDF is linear in support interval:

F_X(x) =
  0,    x < 0
  x,    0 ≤ x ≤ 1
  1,    x > 1
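
The two piecewise CDFs above (Bernoulli and Uniform(0,1)) translate directly into code. A minimal sketch; the function names are illustrative, and p = 0.3 is an arbitrary choice for the demonstration:

```python
def bernoulli_cdf(x: float, p: float) -> float:
    """CDF of Bernoulli(p): jump of size 1 - p at 0 and of size p at 1."""
    if x < 0:
        return 0.0
    if x < 1:
        return 1.0 - p
    return 1.0

def uniform01_cdf(x: float) -> float:
    """CDF of Uniform(0, 1): x clamped to the interval [0, 1]."""
    return min(max(x, 0.0), 1.0)

# Evaluate on both sides of each breakpoint.
print([bernoulli_cdf(x, 0.3) for x in (-0.5, 0.0, 0.5, 1.0)])
print([uniform01_cdf(x) for x in (-0.5, 0.25, 1.0, 2.0)])
```

The Bernoulli function already takes its post-jump value at the jump points x = 0 and x = 1, which is the right-continuity convention.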

Computational Methods

Numerical integration for continuous distributions without closed form. Summation for discrete.
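
As an illustration of the numerical route, the standard normal CDF has no elementary closed form and can be approximated by integrating its density. A sketch using composite Simpson's rule; the truncation point -10 and the number of subintervals are assumptions chosen for this example, not recommended defaults:

```python
import math

def normal_pdf(t: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def normal_cdf_numeric(x: float, lower: float = -10.0, n: int = 2000) -> float:
    """Composite Simpson's rule for the integral of the PDF over [lower, x].

    The lower limit stands in for -infinity; the mass below -10 is negligible.
    n must be even.
    """
    if x <= lower:
        return 0.0
    h = (x - lower) / n
    total = normal_pdf(lower) + normal_pdf(x)
    for i in range(1, n):
        # Simpson weights: 4 on odd interior nodes, 2 on even interior nodes.
        total += (4 if i % 2 else 2) * normal_pdf(lower + i * h)
    return total * h / 3.0

def normal_cdf_reference(x: float) -> float:
    """Reference value via the error function from the standard library."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

for x in (-1.0, 0.0, 1.0, 2.0):
    print(x, normal_cdf_numeric(x), normal_cdf_reference(x))
```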

Software Implementations

Functions available in R (pnorm, pbinom), Python (the cdf method of scipy.stats distributions, e.g. scipy.stats.norm.cdf), MATLAB (cdf).
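
As a dependency-free counterpart to the packages listed above, Python's standard library (3.8 and later) includes `statistics.NormalDist`, which exposes `cdf` and `inv_cdf` methods. A short sketch for the standard normal; the evaluation points are arbitrary:

```python
from statistics import NormalDist

z = NormalDist(mu=0.0, sigma=1.0)

p = z.cdf(1.96)        # P(Z <= 1.96), roughly 0.975
q = z.inv_cdf(0.975)   # value with 97.5% of the mass at or below it, roughly 1.96
print(p, q)
```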

Example Plot Description

Plotting CDF shows step jumps for discrete, smooth curves for continuous distributions.

Applications in Probability and Statistics

Probability Computation

Calculate probabilities for intervals: P(a < X ≤ b) = F_X(b) - F_X(a).
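
A minimal sketch of the interval formula, assuming a standard normal variable whose CDF is written with `math.erf`; `interval_prob` is a hypothetical helper that accepts any CDF:

```python
import math

def std_normal_cdf(x: float) -> float:
    """Standard normal CDF expressed through the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def interval_prob(cdf, a: float, b: float) -> float:
    """P(a < X <= b) = F(b) - F(a), for a <= b."""
    if a > b:
        raise ValueError("need a <= b")
    return cdf(b) - cdf(a)

# Probability that a standard normal falls within one standard deviation of 0.
print(interval_prob(std_normal_cdf, -1.0, 1.0))
```

For a continuous variable the endpoints carry no mass, so the same value applies to closed and open intervals. For a discrete variable the half-open form matters: including the left endpoint requires subtracting the left limit of F_X at a instead of F_X(a).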

Statistical Inference

Used in hypothesis testing, confidence interval construction.

Simulation

Inverse transform method uses inverse CDF to generate random samples.
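
A sketch of the inverse transform method for one case where the inverse CDF has a closed form, the Exponential distribution: F(x) = 1 - exp(-rate·x) gives F⁻¹(u) = -ln(1 - u) / rate. The rate, seed, and sample size are arbitrary choices for the demonstration:

```python
import math
import random

def sample_exponential(rate: float, rng: random.Random) -> float:
    """Draw from Exponential(rate) by applying the inverse CDF to a uniform draw."""
    u = rng.random()                    # uniform on [0, 1)
    return -math.log(1.0 - u) / rate    # 1 - u lies in (0, 1], so the log is finite

rng = random.Random(0)
samples = [sample_exponential(2.0, rng) for _ in range(100_000)]

# The sample mean should be close to the theoretical mean 1 / rate = 0.5.
print(sum(samples) / len(samples))
```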

Risk Assessment

Modeling cumulative probabilities of losses or events up to thresholds.

Reliability Engineering

Failure time distributions analyzed via CDFs.

Continuity and Discontinuity

Right-Continuity Defined

F_X is continuous from the right at every x: lim_{h→0+} F_X(x + h) = F_X(x).

Points of Discontinuity

Discrete distributions have jump discontinuities at points with positive mass.

Implications

Jumps correspond to probabilities of exact values in discrete variables.

Continuous Distributions

No jumps; CDF continuous everywhere.

Mixed Cases

Combine continuous intervals and jump points.

Inverse CDF and Quantile Function

Definition

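Inverse CDF or quantile function: Q(p) = inf{x: F_X(x) ≥ p} for p ∈ (0,1). At p = 0 the infimum is -∞ and at p = 1 it may be +∞, so the endpoints are usually excluded or treated separately.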
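
For a discrete distribution with finitely many values, the infimum in this definition is simply the first value at which the running sum of probabilities reaches p. A minimal sketch; `quantile_discrete` is an illustrative name, and the function restricts p to (0, 1] because at p = 0 the infimum is -∞:

```python
def quantile_discrete(values, probs, p):
    """Generalized inverse: smallest value x with F(x) >= p.

    values must be sorted in ascending order; probs are the matching point masses.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    cumulative = 0.0
    for x, mass in zip(values, probs):
        cumulative += mass
        if cumulative >= p:
            return x
    return values[-1]  # guards against floating-point shortfall in the sum

# Bernoulli(0.3): mass 0.7 at 0 and mass 0.3 at 1.
for p in (0.5, 0.7, 0.9):
    print(p, quantile_discrete([0, 1], [0.7, 0.3], p))
```

Every p up to 0.7 maps to 0 and every larger p maps to 1, so the quantile function is a step function even though the CDF has no ordinary inverse.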

Usage

Generates random variables from uniform samples (inverse transform sampling).

Properties

Non-decreasing and left-continuous on (0,1). Exists for all distributions, including those whose CDF is not invertible in the ordinary sense.

Examples

Uniform(0,1): Q(p) = p; Normal distribution quantiles via numerical methods.

Computational Algorithms

Numerical inversion methods: bisection, Newton-Raphson.
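
A sketch of the bisection approach, assuming a continuous, non-decreasing CDF and a bracket [lo, hi] that contains the quantile. The bracket and tolerance are arbitrary defaults for this example, and `invert_cdf_bisection` is a hypothetical helper name:

```python
import math

def std_normal_cdf(x: float) -> float:
    """Standard normal CDF expressed through the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def invert_cdf_bisection(cdf, p: float, lo: float = -10.0, hi: float = 10.0,
                         tol: float = 1e-10) -> float:
    """Approximate the x with cdf(x) = p by repeatedly halving the bracket."""
    if not cdf(lo) <= p <= cdf(hi):
        raise ValueError("p is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < p:
            lo = mid   # quantile lies to the right of mid
        else:
            hi = mid   # quantile lies at or to the left of mid
    return 0.5 * (lo + hi)

# The 97.5% quantile of the standard normal, roughly 1.96.
print(invert_cdf_bisection(std_normal_cdf, 0.975))
```

Bisection only needs monotonicity, so it is slower than Newton-Raphson but does not require the density or a good starting point.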

Multivariate Cumulative Distribution Functions

Definition

CDF of vector X = (X_1, ..., X_n): F_X(x_1, ..., x_n) = P(X_1 ≤ x_1, ..., X_n ≤ x_n).

Properties

Non-decreasing and right-continuous in each argument. Tends to 0 as any single x_i → -∞ and to 1 as all x_i → ∞. Must also assign non-negative probability to every rectangle.

Marginal Distributions

Obtained by letting the other variables tend to infinity; e.g., F_{X_1}(x_1) = lim_{x_2→∞, ..., x_n→∞} F_X(x_1, x_2, ..., x_n).
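
A small sketch of recovering a marginal from a joint CDF. The joint distribution here, two independent Exponential(1) variables, is an assumption made so that the result can be checked against a known marginal:

```python
import math

def joint_cdf(x1: float, x2: float) -> float:
    """Joint CDF of two independent Exponential(1) variables (assumed example)."""
    f1 = 1.0 - math.exp(-x1) if x1 > 0 else 0.0
    f2 = 1.0 - math.exp(-x2) if x2 > 0 else 0.0
    return f1 * f2

def marginal_cdf_x1(x1: float) -> float:
    """Marginal CDF of X_1: evaluate the joint CDF with the other argument at +infinity."""
    return joint_cdf(x1, math.inf)

# The marginal should match the Exponential(1) CDF, 1 - exp(-x1).
print(marginal_cdf_x1(1.0), 1.0 - math.exp(-1.0))
```

Passing `math.inf` works here because the closed form has a well-defined value at infinity; for a joint CDF available only numerically, a sufficiently large finite value would be used instead.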

Copulas

Functions that join univariate marginals to form multivariate CDFs. Capture dependence structure.
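
In the bivariate case this joining takes the form F(x_1, x_2) = C(F_1(x_1), F_2(x_2)). A minimal sketch, assuming Exponential marginals with arbitrarily chosen rates and two textbook copulas, independence C(u, v) = uv and the comonotone copula C(u, v) = min(u, v); all function names are illustrative:

```python
import math

def exp_cdf(x: float, rate: float) -> float:
    """CDF of Exponential(rate)."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0

def independence_copula(u: float, v: float) -> float:
    """Copula of independent variables."""
    return u * v

def comonotone_copula(u: float, v: float) -> float:
    """Copula of perfectly positively dependent variables."""
    return min(u, v)

def joint_cdf(copula, x1: float, x2: float) -> float:
    """Joint CDF built from the marginals Exponential(1) and Exponential(2)."""
    return copula(exp_cdf(x1, 1.0), exp_cdf(x2, 2.0))

# Same marginals, different dependence structure, different joint probabilities.
print(joint_cdf(independence_copula, 1.0, 1.0))
print(joint_cdf(comonotone_copula, 1.0, 1.0))
```

Both constructions return the Exponential(1) marginal when the second argument is sent to infinity, since C(u, 1) = u for either copula.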

Applications

Modeling joint risks, multivariate statistical inference, dependence analysis.

Estimation from Data

Empirical CDF

Defined as F_n(x) = (1/n) ∑_{i=1}^n I_{X_i ≤ x} using sample data. Step function with jumps at data points.
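
A minimal sketch of this estimator using only the standard library; `make_ecdf` is an illustrative name and the four-point sample is made up for the demonstration:

```python
from bisect import bisect_right

def make_ecdf(sample):
    """Return the empirical CDF F_n of the sample as a callable."""
    data = sorted(sample)
    n = len(data)
    if n == 0:
        raise ValueError("sample must not be empty")

    def ecdf(x: float) -> float:
        # bisect_right counts the observations that are <= x.
        return bisect_right(data, x) / n

    return ecdf

F_n = make_ecdf([2.0, 1.0, 3.0, 2.0])
print([F_n(x) for x in (0.5, 1.0, 2.0, 2.5, 3.0)])
```

The repeated observation 2.0 produces a jump of size 2/n at that point, and using `bisect_right` rather than `bisect_left` is what makes the step function right-continuous.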

Properties

Non-decreasing, right-continuous. For i.i.d. samples, converges uniformly to the true CDF almost surely (Glivenko-Cantelli theorem).

Smoothing Techniques

Kernel smoothing applied to empirical CDF to estimate continuous distributions.
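
One common form of this idea replaces each indicator in the empirical CDF with a smooth kernel CDF. A sketch assuming a Gaussian kernel, so the estimate is the average of Φ((x - X_i) / h); the bandwidth h is supplied by the caller, and no bandwidth selection rule is implied:

```python
import math

def std_normal_cdf(z: float) -> float:
    """Standard normal CDF, used here as the integrated Gaussian kernel."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def smoothed_cdf(sample, x: float, bandwidth: float) -> float:
    """Kernel-smoothed CDF estimate: mean of Phi((x - X_i) / bandwidth)."""
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    return sum(std_normal_cdf((x - xi) / bandwidth) for xi in sample) / len(sample)

sample = [-1.0, 0.0, 1.0]
for x in (-2.0, 0.0, 2.0):
    print(x, smoothed_cdf(sample, x, bandwidth=0.5))
```

As the bandwidth shrinks toward zero the estimate approaches the step-shaped empirical CDF; larger bandwidths give a smoother but more biased curve.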

Confidence Bands

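Dvoretzky–Kiefer–Wolfowitz inequality bounds the estimation error for i.i.d. samples: P(sup_x |F_n(x) - F_X(x)| > ε) ≤ 2 exp(-2nε²). Solving for ε at a chosen level gives a uniform confidence band around F_n.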
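
The Dvoretzky–Kiefer–Wolfowitz bound has the form 2 exp(-2nε²). Setting it equal to a level α and solving for ε gives a band half-width ε = sqrt(ln(2/α) / (2n)). A sketch under the assumption of i.i.d. data; the helper names are illustrative:

```python
import math

def dkw_epsilon(n: int, alpha: float) -> float:
    """Half-width eps solving 2 * exp(-2 * n * eps**2) = alpha."""
    if n <= 0 or not 0.0 < alpha < 1.0:
        raise ValueError("need n > 0 and 0 < alpha < 1")
    return math.sqrt(math.log(2.0 / alpha) / (2.0 * n))

def dkw_band(ecdf_value: float, n: int, alpha: float):
    """Lower and upper band limits at one point, clipped to [0, 1]."""
    eps = dkw_epsilon(n, alpha)
    return max(ecdf_value - eps, 0.0), min(ecdf_value + eps, 1.0)

# With n = 100 and alpha = 0.05 the half-width is roughly 0.136.
print(dkw_epsilon(100, 0.05))
print(dkw_band(0.40, 100, 0.05))
```

The half-width shrinks like 1/sqrt(n) and does not depend on x, so the same ε applies across the whole empirical CDF.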

Practical Considerations

Sample size, data quality impact accuracy of empirical CDF.

Limitations and Considerations

Non-uniqueness of PDF

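Different densities can yield the same CDF if they differ only on a set of measure zero. The PDF is therefore unique only almost everywhere, while the CDF itself is unique.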

Discontinuities in Mixed Distributions

Complicate differentiation and inversion procedures.

Numerical Challenges

Computing CDFs without closed form requires approximation, may induce errors.

Multivariate Complexity

Higher dimensions increase computational and interpretive difficulty.

Interpretation Nuances

Requires understanding of underlying random variable type (discrete, continuous, mixed).

Advanced Topics and Extensions

Generalized CDFs

Extensions to random elements in abstract spaces, distribution functions on metric spaces.

Stochastic Processes

CDFs for processes describe joint distributions over time or indices.

Conditional CDFs

Define distributions conditional on events or sigma-algebras. Basis for Bayesian inference.

Order Statistics

CDFs of sorted samples: describe distribution of minima, maxima, medians.
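
For n independent draws from a common CDF F, the extremes have closed forms: P(max ≤ x) = F(x)^n and P(min ≤ x) = 1 - (1 - F(x))^n. A minimal sketch under that i.i.d. assumption; the function names are illustrative and take the value F(x) as input:

```python
def cdf_of_max(F_x: float, n: int) -> float:
    """P(max of n i.i.d. draws <= x): all n draws must be <= x."""
    return F_x ** n

def cdf_of_min(F_x: float, n: int) -> float:
    """P(min of n i.i.d. draws <= x): complement of all n draws exceeding x."""
    return 1.0 - (1.0 - F_x) ** n

# Uniform(0, 1) at x = 0.5, so F(x) = 0.5, with n = 3 draws.
print(cdf_of_max(0.5, 3), cdf_of_min(0.5, 3))
```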

Extreme Value Theory

Study of limiting distributions for maxima/minima, involves specific CDF forms.
