Definition

Overview

The geometric distribution models the number of independent Bernoulli trials needed to achieve the first success. Each trial has exactly two outcomes: success (with probability p) or failure (with probability 1 - p).

Two Common Versions

Version 1: Counts trials until first success (X ∈ {1, 2, 3, …}).
Version 2: Counts failures before first success (Y ∈ {0, 1, 2, …}).

Bernoulli Process

Trials are independent and identically distributed (i.i.d.) with constant success probability p. The geometric distribution describes the waiting time for the first success in this process.

Properties

Discreteness

The support is countably infinite: discrete values representing trial counts (version X) or failure counts (version Y).

Parameter

Single parameter p ∈ (0,1), probability of success per trial.

Distribution Shape

The PMF is monotonically decreasing for every p ∈ (0, 1); the distribution is right-skewed, with a heavier tail the smaller p is.

Memoryless

Geometric is the only discrete distribution with the memoryless property: P(X > m+n | X > m) = P(X > n).

Support Differences

Note: Different textbooks adopt different supports; be consistent when applying formulas.

PMF and CDF

Probability Mass Function (PMF)

For version counting trials (X):

P(X = k) = (1 - p)^(k-1) * p, k = 1, 2, 3, ...

For version counting failures (Y):

P(Y = k) = (1 - p)^k * p, k = 0, 1, 2, ...

Cumulative Distribution Function (CDF)

For X:

F(k) = P(X ≤ k) = 1 - (1 - p)^k, k = 1, 2, 3, ...

For Y:

F(k) = P(Y ≤ k) = 1 - (1 - p)^(k+1), k = 0, 1, 2, ...

Survival Function

For version X:

S(k) = P(X > k) = (1 - p)^k
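These three functions are mutually consistent; a minimal Python sketch (function names are illustrative, not from any library) checks the identities numerically for version X:

```python
def geom_pmf(k, p):
    # P(X = k) = (1 - p)^(k - 1) * p, trial-counting version X
    return (1 - p) ** (k - 1) * p

def geom_cdf(k, p):
    # F(k) = 1 - (1 - p)^k
    return 1 - (1 - p) ** k

def geom_sf(k, p):
    # S(k) = P(X > k) = (1 - p)^k
    return (1 - p) ** k

p = 0.3
# The CDF is the running sum of the PMF:
assert abs(geom_cdf(5, p) - sum(geom_pmf(k, p) for k in range(1, 6))) < 1e-12
# The survival function is the complement of the CDF:
assert abs(geom_sf(5, p) - (1 - geom_cdf(5, p))) < 1e-12
```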

Expectation and Variance

Expectation

Version X (trials until first success):

E[X] = 1 / p

Version Y (failures before first success):

E[Y] = (1 - p) / p

Variance

Version X:

Var(X) = (1 - p) / p^2

Version Y:

Var(Y) = (1 - p) / p^2

Higher Moments

Skewness: (2 - p) / sqrt(1 - p)
Excess kurtosis: 6 + p^2 / (1 - p)
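A quick Monte Carlo sketch (plain Python; the seed and sample size are arbitrary) checks E[X] = 1/p and Var(X) = (1 - p)/p^2 empirically:

```python
import random

def sample_geometric(p, rng):
    # Count Bernoulli(p) trials until the first success (version X)
    k = 1
    while rng.random() >= p:
        k += 1
    return k

rng = random.Random(42)
p, n = 0.25, 200_000
xs = [sample_geometric(p, rng) for _ in range(n)]
mean = sum(xs) / n
var = sum((x - mean) ** 2 for x in xs) / n
# Theory: E[X] = 1/p = 4, Var(X) = (1 - p)/p^2 = 12
print(mean, var)
```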

Memoryless Property

Definition

Distribution satisfies: P(X > m + n | X > m) = P(X > n) for all m, n ≥ 0.

Implication

Past failures do not affect future success probability; no aging.

Uniqueness

Geometric is the only discrete memoryless distribution; exponential is the continuous analog.
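The memoryless identity follows directly from the survival function S(k) = (1 - p)^k; a short sketch (the values of p, m, n are arbitrary) verifies it:

```python
p = 0.2

def sf(k):
    # P(X > k) = (1 - p)^k for the trial-counting version X
    return (1 - p) ** k

m, n = 3, 5
conditional = sf(m + n) / sf(m)   # P(X > m + n | X > m)
unconditional = sf(n)             # P(X > n)
assert abs(conditional - unconditional) < 1e-12
```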

Relation to Other Distributions

Bernoulli Distribution

The geometric distribution counts the number of i.i.d. Bernoulli(p) trials needed to obtain the first success.

Negative Binomial Distribution

Geometric is a special case of Negative Binomial with number of successes r = 1.

Exponential Distribution

The geometric distribution is the discrete analog of the exponential distribution for waiting times.

Binomial Distribution

The binomial distribution counts successes in a fixed number of trials; the geometric distribution counts the number of trials until the first success.

Parameter Estimation

Maximum Likelihood Estimation (MLE)

Given sample {x1, x2, ..., xn} from geometric (version X):

MLE of p = n / (Σ x_i)

Method of Moments

Estimate p by equating sample mean to theoretical mean:

p̂ = 1 / (sample mean), which for version X coincides with the MLE.

Properties

The MLE is biased in finite samples but consistent and asymptotically efficient.
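A small simulation sketch (seed and sample size arbitrary) recovers p via p̂ = n / Σ x_i:

```python
import random

rng = random.Random(0)
true_p = 0.3

def draw():
    # One geometric draw: count Bernoulli trials until the first success
    k = 1
    while rng.random() >= true_p:
        k += 1
    return k

sample = [draw() for _ in range(50_000)]
p_hat = len(sample) / sum(sample)   # MLE: n / sum of observations
print(p_hat)
```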

Applications

Reliability Engineering

Modeling number of inspections until first failure.

Quality Control

Estimating number of items tested until defect found.

Telecommunications

Waiting times in packet transmissions or error occurrences.

Biology

Number of trials until a successful mutation or event.

Computer Science

Algorithm analysis for randomized processes with first success events.

Simulation Techniques

Inverse Transform Sampling

Generate U ~ Uniform(0,1), then:

X = ceil(log(1 - U) / log(1 - p))
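A minimal sketch of this sampler (the max(1, …) guard covers the measure-zero case U = 0, where the raw formula would return 0):

```python
import math
import random

def geometric_inverse_transform(p, rng):
    # X = ceil(log(1 - U) / log(1 - p)); guard against U == 0 giving X = 0
    u = rng.random()
    return max(1, math.ceil(math.log(1 - u) / math.log(1 - p)))

rng = random.Random(7)
p = 0.4
draws = [geometric_inverse_transform(p, rng) for _ in range(100_000)]
print(sum(draws) / len(draws))  # should be near 1/p = 2.5
```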

Rejection Sampling

Rarely used for the geometric distribution, since inverse transform sampling is simple and exact.

Direct Sampling

Simulate Bernoulli trials until first success; count trials.
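Direct sampling is the most literal approach; a sketch (seed and sample size arbitrary) comparing empirical frequencies against the PMF:

```python
import random
from collections import Counter

rng = random.Random(1)
p, n = 0.5, 100_000

def draw():
    # Run Bernoulli(p) trials until the first success and count them
    k = 1
    while rng.random() >= p:
        k += 1
    return k

counts = Counter(draw() for _ in range(n))
for k in range(1, 5):
    empirical = counts[k] / n
    theoretical = (1 - p) ** (k - 1) * p
    print(k, round(empirical, 3), theoretical)
```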

Examples

Example 1: Coin Toss

Probability of heads p = 0.5; expected tosses until first head: 2.

Example 2: Defect Detection

Item defect rate p = 0.1; expected inspections until finding defect: 10.

Example 3: Network Packet Loss

Packet success probability p = 0.95; expected transmissions until success: ~1.05.

Tables

PMF Values for Selected p and k

k     p=0.2     p=0.5     p=0.8
1     0.20      0.50      0.80
2     0.16      0.25      0.16
3     0.128     0.125     0.032
4     0.1024    0.0625    0.0064

Expected Value and Variance for Different p

p     E[X]    Var(X)
0.1   10      90
0.3   3.33    7.78
0.5   2       2
0.7   1.43    0.61
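The table values follow directly from E[X] = 1/p and Var(X) = (1 - p)/p^2; a one-loop sketch reproduces them:

```python
# Reproduce the table from E[X] = 1/p and Var(X) = (1 - p)/p^2
for p in (0.1, 0.3, 0.5, 0.7):
    print(p, round(1 / p, 2), round((1 - p) / p ** 2, 2))
```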

Formulas

PMF

P(X = k) = (1 - p)^(k-1) * p, k = 1, 2, 3, ...

CDF

F(k) = 1 - (1 - p)^k

Expectation

E[X] = 1 / p

Variance

Var(X) = (1 - p) / p^2

Memoryless Property

P(X > m + n | X > m) = P(X > n)

MLE for p

p̂ = n / Σ x_i
