Bernoulli Distribution

Definition

Overview

Bernoulli distribution: simplest discrete distribution. Models single binary experiment. Outcomes: success (1) or failure (0). Parameter: p = probability of success.

Mathematical form

Random variable X takes values {0,1}. Probability mass function (PMF) defined as:

P(X = x) = p^x (1 - p)^{1 - x}, x ∈ {0,1}
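The formula can be checked directly in Python. A minimal sketch (the function name `bernoulli_pmf` is illustrative, not from any library):

```python
def bernoulli_pmf(x, p):
    """P(X = x) = p^x * (1 - p)^(1 - x) for x in {0, 1}."""
    if x not in (0, 1):
        raise ValueError("x must be 0 or 1")
    return p**x * (1 - p)**(1 - x)

# With p = 0.3: P(X=1) = 0.3, P(X=0) = 0.7, and the two probabilities sum to 1.
print(bernoulli_pmf(1, 0.3))  # 0.3
print(bernoulli_pmf(0, 0.3))  # 0.7
```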

Historical background

Introduced by Jacob Bernoulli (1654–1705). Foundation for probability theory. Basis for binomial distribution and hypothesis testing.

"The theory of probability is the mathematics of uncertainty." -- Jacob Bernoulli

Parameters

Parameter p

p ∈ [0,1]; probability of success. Controls distribution shape. p=0 degenerate at 0; p=1 degenerate at 1.

Interpretation

p represents likelihood of event occurring. Complement (1-p) is failure probability.

Parameter estimation

Maximum likelihood estimator (MLE): sample mean of observed outcomes. Consistent and unbiased for p.
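Since the MLE is just the sample mean of the 0/1 outcomes, it is a one-liner. A sketch (the helper name `mle_p` is illustrative):

```python
def mle_p(samples):
    """MLE of p for Bernoulli data: the sample mean of the 0/1 outcomes."""
    return sum(samples) / len(samples)

data = [1, 0, 1, 1, 0, 1, 0, 1]  # 5 successes out of 8 trials
print(mle_p(data))  # 0.625
```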

Probability Mass Function (PMF)

Definition

PMF assigns probability to each possible outcome (0 or 1). Formula:

f(x) = P(X = x) = p^x (1-p)^{1-x}, x ∈ {0,1}

Tabulated PMF

Outcome (x)    Probability P(X = x)
-----------    --------------------
0 (failure)    1 - p
1 (success)    p

Properties

Sum of probabilities: 1. Support: {0,1}. Discrete distribution.

Mean and Variance

Mean (Expected Value)

Formula:

E[X] = p

Variance

Formula:

Var(X) = p(1 - p)

Interpretation

Mean equals probability of success. Variance maximal at p=0.5, minimal at p=0 or 1.
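Both formulas can be verified empirically by simulation. A sketch using only the standard library, with p = 0.5 so the variance should be near its maximum of 0.25:

```python
import random

random.seed(0)
p = 0.5
n = 100_000
draws = [1 if random.random() < p else 0 for _ in range(n)]

mean = sum(draws) / n
var = sum((x - mean) ** 2 for x in draws) / n

# Theory: E[X] = p = 0.5 and Var(X) = p(1 - p) = 0.25.
print(round(mean, 2), round(var, 2))
```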

Bernoulli Trials

Definition

Sequence of independent Bernoulli experiments. Each trial: binary outcome with same p.

Independence assumption

Trials are independent: result of one does not affect others.

Applications

Model coin tosses, pass/fail tests, yes/no surveys, and other binary phenomena.

Relationship to Binomial Distribution

Binomial as sum of Bernoulli trials

Binomial distribution: sum of n independent Bernoulli random variables with parameter p.

Notation

If X_i ∼ Bernoulli(p), then S_n = ∑_{i=1}^n X_i ∼ Binomial(n,p).
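This relationship can be demonstrated by simulation: summing n Bernoulli draws and comparing the empirical frequencies against the binomial PMF. A sketch with n = 10, p = 0.3:

```python
import random
from math import comb

random.seed(1)
n, p = 10, 0.3
trials = 50_000

# S_n = sum of n independent Bernoulli(p) draws; tally its distribution.
counts = [0] * (n + 1)
for _ in range(trials):
    s = sum(1 if random.random() < p else 0 for _ in range(n))
    counts[s] += 1

# Compare the empirical P(S_n = 3) with the Binomial(10, 0.3) PMF value.
empirical = counts[3] / trials
exact = comb(n, 3) * p**3 * (1 - p)**7
print(round(empirical, 3), round(exact, 3))
```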

Implications

Bernoulli is building block for binomial. Single trial case: binomial with n=1.

Applications

Statistics

Hypothesis testing for binary data. Estimating probabilities of success/failure.

Computer Science

Modeling random binary flags, Bernoulli processes in algorithms, randomized decision making.

Engineering

Reliability testing of components: pass/fail outcomes.

Economics and Social Sciences

Modeling yes/no survey responses, binary choices, and success/failure events.

Properties

Support

Discrete set {0,1} only.

Moments

All moments exist. E[X^k] = p for all k≥1 because X^k=X when X∈{0,1}.
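Because the support is {0, 1}, every raw moment can be computed directly as an expectation over the two outcomes; a quick check:

```python
# Since X takes only values 0 and 1, X**k == X, so E[X^k] = p for every k >= 1.
p = 0.4
moments = []
for k in (1, 2, 5):
    m = 0**k * (1 - p) + 1**k * p  # direct expectation over the support {0, 1}
    moments.append(m)
print(moments)  # [0.4, 0.4, 0.4]
```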

Memorylessness

Bernoulli distribution is not memoryless. Among common distributions, only the geometric (discrete) and exponential (continuous) have the memoryless property.

Skewness and kurtosis

Skewness = (1 - 2p)/√(p(1-p)). Excess kurtosis = (1 - 6p(1-p)) / (p(1-p)). Both are undefined at p = 0 and p = 1.
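A sketch computing both quantities (note the kurtosis formula here is the excess kurtosis, i.e., relative to the normal distribution's 3; the function names are illustrative):

```python
from math import sqrt

def bernoulli_skewness(p):
    """(1 - 2p) / sqrt(p(1 - p)); undefined at p = 0 or 1."""
    return (1 - 2 * p) / sqrt(p * (1 - p))

def bernoulli_excess_kurtosis(p):
    """(1 - 6p(1 - p)) / (p(1 - p)); undefined at p = 0 or 1."""
    return (1 - 6 * p * (1 - p)) / (p * (1 - p))

# At p = 0.5 the distribution is symmetric: skewness 0, excess kurtosis -2.
print(bernoulli_skewness(0.5))         # 0.0
print(bernoulli_excess_kurtosis(0.5))  # -2.0
```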

Moment Generating Function

Definition

MGF M_X(t) = E[e^{tX}].

Formula

M_X(t) = (1 - p) + p e^{t}

Usage

MGF used to derive moments, analyze sums of independent Bernoulli variables.
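One standard use: the first derivative of the MGF at t = 0 gives the mean. A numeric sketch using a central finite difference to approximate M'(0):

```python
from math import exp

p = 0.3

def M(t):
    """Bernoulli MGF: (1 - p) + p * e^t."""
    return (1 - p) + p * exp(t)

# M'(0) = p, approximated here by a central finite difference.
h = 1e-6
mean_from_mgf = (M(h) - M(-h)) / (2 * h)
print(round(mean_from_mgf, 6))  # 0.3
```

Note also that M(0) = 1, as required for any MGF.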

Entropy

Definition

Entropy measures uncertainty of Bernoulli variable.

Formula

H(X) = - p log_2 p - (1 - p) log_2 (1 - p)

Interpretation

Entropy maximal at p=0.5 (1 bit). Minimal (0) at p=0 or 1 (certainty).
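The entropy formula translates directly to code, with the boundary cases handled by the convention 0 log 0 = 0. A sketch (the function name is illustrative):

```python
from math import log2

def bernoulli_entropy(p):
    """Shannon entropy in bits; 0 by convention when p is 0 or 1."""
    if p in (0, 1):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

print(bernoulli_entropy(0.5))  # 1.0  (maximum uncertainty: one full bit)
print(bernoulli_entropy(1.0))  # 0.0  (certainty)
```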

Simulation

Generating Bernoulli random variables

Method: generate uniform random number U ∈ [0,1]. If U ≤ p, output 1; else 0.

Algorithm

function Bernoulli(p):
    U = Uniform(0, 1)
    if U ≤ p:
        return 1
    else:
        return 0
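The algorithm is a direct inverse-transform sampler; a Python sketch using the standard library:

```python
import random

def bernoulli(p, rng=random):
    """Inverse-transform sampling: draw U ~ Uniform(0, 1), return 1 if U <= p."""
    if not 0 <= p <= 1:
        raise ValueError("p must lie in [0, 1]")
    return 1 if rng.random() <= p else 0

random.seed(42)
draws = [bernoulli(0.7) for _ in range(10_000)]
print(sum(draws) / len(draws))  # close to 0.7
```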

Software implementations

Available in most statistical packages: R (rbinom with size = 1), Python (numpy.random.binomial with n = 1, or scipy.stats.bernoulli), MATLAB (binornd with N = 1).

Limitations

Binary outcome restriction

Only models two outcomes. Cannot represent multi-category or continuous data.

Parameter simplicity

Single parameter p limits modeling complexity. Cannot capture varying success probabilities per trial.

Independence assumption

Assumes independent trials. Real-world data may violate independence.

Non-memoryless

Not suitable for processes requiring memoryless properties.
