Definition

Origin and Concept

The chi square distribution arises from the sum of squares of independent standard normal variables. Formally, if Z_i ~ N(0,1) are independent for i = 1,...,k, then X = ∑_{i=1}^{k} Z_i^2 follows a chi square distribution with k degrees of freedom.
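The construction can be checked by simulation. The following is a minimal sketch using only the Python standard library; the helper name sample_chi_square, the seed, and the choice k = 4 are illustrative assumptions, not part of the definition.

```python
import random

def sample_chi_square(k, rng):
    """Draw one value as the sum of k squared independent N(0,1) variables."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))

rng = random.Random(0)   # fixed seed so the run is repeatable
k = 4                    # illustrative degrees of freedom
draws = [sample_chi_square(k, rng) for _ in range(20000)]

sample_mean = sum(draws) / len(draws)
print(f"smallest draw: {min(draws):.4f}")   # never negative
print(f"sample mean:   {sample_mean:.3f}")  # should land near k
```

Each squared standard normal has expectation 1, so the sample mean should be close to k.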

Degrees of Freedom (k)

Parameter k represents the number of independent standard normal variables squared and summed. It determines the shape, location, and spread of the distribution. k ∈ ℕ, k ≥ 1.

Support and Range

Chi square distribution is defined only for non-negative real numbers: X ∈ [0, ∞). It is right-skewed with shape depending on k.

Properties

Non-Negativity

Values are always ≥ 0. Distribution describes sum of squared normals, which can't be negative.

Skewness

Right-skewed for every k, most strongly for low k. Skewness equals √(8/k), so it decreases as k increases, and the distribution tends toward normality by the central limit theorem.

Mean and Variance

Mean = k. Variance = 2k. Both scale linearly with degrees of freedom.
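Both results follow from the moments of a single squared standard normal. A short derivation, using the standard fact E[Z_i^4] = 3 and the independence of the Z_i:

```latex
\begin{aligned}
E[Z_i^2] &= \operatorname{Var}(Z_i) + (E[Z_i])^2 = 1, \\
\operatorname{Var}(Z_i^2) &= E[Z_i^4] - (E[Z_i^2])^2 = 3 - 1 = 2, \\
E[X] &= \sum_{i=1}^{k} E[Z_i^2] = k, \qquad
\operatorname{Var}(X) = \sum_{i=1}^{k} \operatorname{Var}(Z_i^2) = 2k.
\end{aligned}
```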

Mode

Mode = max(k - 2, 0) for k ≥ 1. Mode shifts rightward as degrees of freedom increase.

Moment Existence

All positive integer moments exist and can be expressed in terms of gamma functions: E[X^n] = 2^n * Γ(n + k/2) / Γ(k/2) = k(k + 2)···(k + 2n - 2).

Probability Density Function

General Formula

The pdf for chi square distribution with k degrees of freedom is given by:

f(x; k) = (1 / (2^(k/2) * Γ(k/2))) * x^(k/2 - 1) * e^(-x/2), x > 0
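As an illustration, the formula can be evaluated directly. This is a sketch using only the standard library; the function name chi2_pdf is an assumption, and the computation is done in log space (via math.lgamma) purely to avoid overflow for large k.

```python
import math

def chi2_pdf(x, k):
    """Density f(x; k) from the formula above; returns 0 outside x > 0."""
    if x <= 0:
        return 0.0
    half_k = k / 2.0
    log_pdf = ((half_k - 1.0) * math.log(x)   # x^(k/2 - 1)
               - x / 2.0                      # e^(-x/2)
               - half_k * math.log(2.0)       # 1 / 2^(k/2)
               - math.lgamma(half_k))         # 1 / Γ(k/2)
    return math.exp(log_pdf)

# Crude midpoint-rule check that the density integrates to about 1 for k = 5.
step = 0.01
total = sum(chi2_pdf((i + 0.5) * step, 5) * step for i in range(10000))
print(f"integral over (0, 100): {total:.5f}")
```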

Gamma Function Role

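The factor 2^(k/2) * Γ(k/2) normalizes the pdf so that it integrates to 1. The gamma function extends the factorial to non-integer arguments, with Γ(n) = (n - 1)! for positive integers n; this is needed because k/2 is a half-integer whenever k is odd.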

Shape Behavior

For small k, pdf peaks near zero, heavily skewed. For large k, pdf approaches normal shape.

Cumulative Distribution Function

Definition

CDF expresses P(X ≤ x). Computed through lower incomplete gamma function or regularized gamma function.

Formula

F(x; k) = γ(k/2, x/2) / Γ(k/2)

where γ(·,·) is the lower incomplete gamma function.
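One way to evaluate this ratio without external libraries is the power series for the lower incomplete gamma function. The sketch below is illustrative rather than production-grade: the function names are assumptions, and the series converges slowly when x/2 is much larger than k/2, where a library routine would be the better choice.

```python
import math

def regularized_lower_gamma(a, y, max_terms=500, eps=1e-14):
    """γ(a, y) / Γ(a) via the series y^a e^(-y) Σ y^n / (a (a+1) ... (a+n))."""
    if y <= 0:
        return 0.0
    term = 1.0 / a
    total = term
    for n in range(1, max_terms):
        term *= y / (a + n)
        total += term
        if abs(term) < eps * abs(total):
            break
    # Prefactor y^a e^(-y) / Γ(a), computed in log space.
    return total * math.exp(a * math.log(y) - y - math.lgamma(a))

def chi2_cdf(x, k):
    """F(x; k) = γ(k/2, x/2) / Γ(k/2)."""
    return regularized_lower_gamma(k / 2.0, x / 2.0)

print(f"{chi2_cdf(3.841, 1):.4f}")   # close to 0.95
print(f"{chi2_cdf(23.209, 10):.4f}") # close to 0.99
```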

Numerical Evaluation

The incomplete gamma function has no elementary closed form for general k, so numerical libraries and tables are used for practical computation.

Moment Generating Function

Definition

MGF of chi square distribution helps derive moments and analyze sums of independent variables.

Expression

M_X(t) = (1 - 2t)^(-k/2), t < 1/2
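Differentiating this expression at t = 0 recovers the first two moments, and from them the variance:

```latex
\begin{aligned}
M_X'(t) &= k\,(1 - 2t)^{-k/2 - 1}, & M_X'(0) &= E[X] = k, \\
M_X''(t) &= k(k + 2)\,(1 - 2t)^{-k/2 - 2}, & M_X''(0) &= E[X^2] = k(k + 2), \\
\operatorname{Var}(X) &= E[X^2] - (E[X])^2 = k(k + 2) - k^2 = 2k.
\end{aligned}
```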

Applications

Used in variance calculations, central limit theorem approximations, and deriving cumulants.

Characteristic Function

Definition

Characteristic function (CF) encodes full distribution information; useful in distribution theory and convolutions.

Formula

φ_X(t) = (1 - 2it)^(-k/2), t ∈ ℝ

Properties

CF uniquely determines chi square distribution. Enables proofs of convergence and moment computations.

Relationships with Other Distributions

Gamma Distribution

Chi square is a special case of gamma distribution with shape α = k/2, scale β = 2.
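This relationship gives a direct way to sample chi square values. A small sketch, assuming the standard library generator random.gammavariate(alpha, beta), whose second argument is the scale; the seed and k = 6 are arbitrary illustrative choices.

```python
import random

rng = random.Random(42)  # fixed seed for repeatability
k = 6                    # illustrative degrees of freedom

# Gamma with shape k/2 and scale 2 is chi square with k degrees of freedom.
gamma_draws = [rng.gammavariate(k / 2.0, 2.0) for _ in range(20000)]

gamma_mean = sum(gamma_draws) / len(gamma_draws)
gamma_var = sum((d - gamma_mean) ** 2 for d in gamma_draws) / (len(gamma_draws) - 1)
print(f"mean ≈ {gamma_mean:.3f} (expected {k})")
print(f"variance ≈ {gamma_var:.3f} (expected {2 * k})")
```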

Normal Distribution

The sum of squared independent standard normal variables follows a chi square distribution. By the central limit theorem, (X - k) / √(2k) is approximately standard normal for large k.

F Distribution

If U and V are independent chi square variables with d1 and d2 degrees of freedom, the ratio (U/d1) / (V/d2) follows an F distribution with (d1, d2) degrees of freedom.

Exponential Distribution

Chi square with k=2 is equivalent to exponential distribution with mean 2.
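The equivalence can be seen by substituting k = 2 into the chi square density and comparing with the exponential density with rate λ:

```latex
f(x; 2) = \frac{1}{2^{1}\,\Gamma(1)}\, x^{0}\, e^{-x/2}
        = \tfrac{1}{2}\, e^{-x/2}
        = \lambda e^{-\lambda x} \quad \text{with } \lambda = \tfrac{1}{2},
\qquad E[X] = \frac{1}{\lambda} = 2.
```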

Applications

Goodness of Fit Tests

Used to test observed versus expected frequency distributions in categorical data.
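A minimal sketch of such a test, using Pearson's statistic on hypothetical counts from 120 rolls of a die. The data and the function name are invented for illustration; the critical value 11.070 is the tabulated one for k = 5 at α = 0.05.

```python
def pearson_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [18, 22, 16, 25, 19, 20]   # hypothetical counts, 120 rolls in total
expected = [20] * 6                   # fair die: 120 / 6 per face

stat = pearson_statistic(observed, expected)
df = len(observed) - 1                # categories minus one
critical_05 = 11.070                  # tabulated value for k = 5, α = 0.05
reject = stat > critical_05

print(f"statistic = {stat}, df = {df}, reject H0: {reject}")
```

With these counts the statistic is well below the critical value, so the fair-die hypothesis would not be rejected at the 5% level.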

Test of Independence

Applied in contingency tables to evaluate independence between categorical variables.
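The calculation can be sketched as follows, with expected counts taken as (row total × column total) / grand total and (rows - 1)(columns - 1) degrees of freedom. The 2×2 table and the function name are hypothetical, and no continuity correction is applied.

```python
def independence_statistic(table):
    """Return (Pearson statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df

table = [[30, 20],
         [20, 30]]   # hypothetical counts
ind_stat, ind_df = independence_statistic(table)
print(ind_stat, ind_df)
```

For this table the statistic exceeds the tabulated 3.841 (k = 1, α = 0.05) but not 6.635 (α = 0.01).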

Variance Estimation

Confidence intervals for population variance of normal distribution derived using chi square distribution.
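The interval follows from the pivot (n-1)S²/σ². Writing χ²_{p, n-1} for the p-quantile (lower-tail probability p) of the chi square distribution with n - 1 degrees of freedom, a notational convention assumed here, a 1 - α interval is:

```latex
P\!\left( \chi^2_{\alpha/2,\,n-1} \le \frac{(n-1)S^2}{\sigma^2} \le \chi^2_{1-\alpha/2,\,n-1} \right) = 1 - \alpha
\;\;\Longrightarrow\;\;
\frac{(n-1)S^2}{\chi^2_{1-\alpha/2,\,n-1}} \;\le\; \sigma^2 \;\le\; \frac{(n-1)S^2}{\chi^2_{\alpha/2,\,n-1}}.
```

Because the distribution is skewed, the interval is not symmetric around S².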

Model Fit and Residual Analysis

Assess residual sum of squares in regression and model adequacy.

Parameter Estimation

Degrees of Freedom Determination

Usually known from sample size and number of parameters estimated. Crucial for correct test statistics.

Sample Variance-Based Estimation

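For a sample of size n from a normal population with variance σ², the statistic (n-1)S²/σ², where S² is the sample variance, follows a chi square distribution with n - 1 degrees of freedom.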
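A simulation sketch of this statistic, assuming normally distributed data. The sample size, mean, standard deviation, and seed are arbitrary illustrative choices; the simulated values should have mean near n - 1 and variance near 2(n - 1).

```python
import random
import statistics

rng = random.Random(1)            # fixed seed for repeatability
n, mu, sigma = 8, 10.0, 3.0       # hypothetical sample size and population

scaled = []
for _ in range(10000):
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    s2 = statistics.variance(sample)          # sample variance, n - 1 divisor
    scaled.append((n - 1) * s2 / sigma ** 2)  # the statistic (n-1)S²/σ²

scaled_mean = sum(scaled) / len(scaled)
scaled_var = statistics.variance(scaled)
print(f"mean ≈ {scaled_mean:.3f} (expected {n - 1})")
print(f"variance ≈ {scaled_var:.3f} (expected {2 * (n - 1)})")
```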

Maximum Likelihood Estimation

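In most applications k is fixed by the study design and is not estimated. If k is treated as unknown, the method-of-moments estimate is the sample mean (since E[X] = k), while the MLE solves ψ(k/2) = mean(ln x_i) - ln 2, where ψ is the digamma function; the two estimates generally differ.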

Hypothesis Testing

Null Hypothesis Framework

Null hypothesis often specifies expected frequencies or variance values tested using chi square statistics.

Test Statistic Construction

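The sum of squared standardized residuals or deviations forms the test statistic, which follows a chi square distribution (exactly or approximately) under H₀. For frequency data this is Pearson's statistic, ∑ (O_i - E_i)² / E_i, with O_i the observed and E_i the expected counts.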

Decision Criteria

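Compare the calculated statistic to the critical value from a chi square table at the chosen significance level α. For the upper-tailed tests typical of frequency data, reject H₀ when the statistic exceeds the critical value.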

Tables and Critical Values

Standard Chi Square Tables

Provide critical values for various α levels and degrees of freedom for hypothesis testing.

Example Critical Values

Degrees of Freedom (k)    α = 0.05    α = 0.01
1                         3.841       6.635
5                         11.070      15.086
10                        18.307      23.209

Interpolation and Software

Intermediate values found by interpolation or software (R, Python, MATLAB) providing exact p-values.
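As a rough illustration of what such software does, the sketch below computes upper-tail critical values for integer k with the standard library alone. It relies on two closed-form base cases (an error-function expression for k = 1, an exponential for k = 2) and the standard recurrence P(a+1, y) = P(a, y) - y^a e^(-y) / Γ(a+1) for the regularized incomplete gamma function, then inverts the CDF by bisection. The function names are assumptions; statistical packages use more refined algorithms.

```python
import math

def chi2_cdf_int(x, k):
    """CDF for integer k >= 1, built up from k = 1 or k = 2 by recurrence."""
    if x <= 0:
        return 0.0
    y = x / 2.0
    if k % 2 == 1:
        a, cdf = 0.5, math.erf(math.sqrt(y))   # k = 1 base case
    else:
        a, cdf = 1.0, 1.0 - math.exp(-y)       # k = 2 base case
    while a < k / 2.0:
        # Step from shape a to a + 1 (that is, from k to k + 2).
        cdf -= math.exp(a * math.log(y) - y - math.lgamma(a + 1.0))
        a += 1.0
    return cdf

def chi2_critical(alpha, k, tol=1e-10):
    """Value c with P(X > c) = alpha, found by bisection on the CDF."""
    target = 1.0 - alpha
    lo, hi = 0.0, 1.0
    while chi2_cdf_int(hi, k) < target:   # grow the bracket until it covers c
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if chi2_cdf_int(mid, k) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

for k in (1, 5, 10):
    print(k, round(chi2_critical(0.05, k), 3), round(chi2_critical(0.01, k), 3))
```

The upper-tail p-value for an observed statistic x is 1 - chi2_cdf_int(x, k).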

Limitations and Assumptions

Independence Assumption

Assumes independent samples or observations. Correlated data invalidate chi square test assumptions.

Sample Size

Chi square approximations require sufficiently large sample sizes for validity. A common rule of thumb for frequency tests is an expected count of at least 5 in each cell.

Normality Requirement

Underlying variables assumed normal or approximated by normality in variance-related tests.

Non-Negativity Constraint

Distribution only applicable for non-negative sums of squares; not suitable for negative-valued statistics.
