Definition
Origin and Concept
Chi square distribution arises from the sum of squares of independent standard normal variables. Formally, if Z_i ~ N(0,1) are independent for i = 1, ..., k, then X = ∑_{i=1}^{k} Z_i^2 follows a chi square distribution with k degrees of freedom.
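The construction can be checked by simulation. The following is a minimal sketch using only the Python standard library; the helper name `sample_chi_square`, the seed, the choice k = 4, and the number of draws are illustrative assumptions, not part of the definition.

```python
import random

def sample_chi_square(k, rng):
    """Draw one chi square(k) variate as a sum of k squared N(0,1) draws."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))

rng = random.Random(12345)   # fixed seed so the run is repeatable
k = 4                        # illustrative degrees of freedom
draws = [sample_chi_square(k, rng) for _ in range(50_000)]

sample_mean = sum(draws) / len(draws)
print(f"k = {k}, sample mean = {sample_mean:.3f}")  # expected to be close to k
```

Every draw is non-negative by construction, and the sample mean should land near k.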
Degrees of Freedom (k)
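Parameter k represents the number of independent standard normal variables squared and summed. It is the only parameter and determines the shape, location, and spread of the distribution; there is no separate scale parameter. k ∈ ℕ, k ≥ 1.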
Support and Range
Chi square distribution is defined only for non-negative real numbers: X ∈ [0, ∞). It is right-skewed with shape depending on k.
Properties
Non-Negativity
Values are always ≥ 0. Distribution describes sum of squared normals, which can't be negative.
Skewness
Right-skewed for every k, most strongly for low k. Skewness = √(8/k), which decreases as k increases; the distribution tends toward normality by the central limit theorem.
Mean and Variance
Mean = k. Variance = 2k. Both scale linearly with degrees of freedom.
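Both results follow from the definition as a sum of k independent squared standard normals. The short derivation below is a sketch that uses the standard fact E[Z^4] = 3 for Z ~ N(0,1); independence is what allows the variances to be added.

```latex
\begin{aligned}
E[Z_i^2] &= \operatorname{Var}(Z_i) + (E[Z_i])^2 = 1, \\
\operatorname{Var}(Z_i^2) &= E[Z_i^4] - (E[Z_i^2])^2 = 3 - 1 = 2, \\
E[X] &= \sum_{i=1}^{k} E[Z_i^2] = k, \\
\operatorname{Var}(X) &= \sum_{i=1}^{k} \operatorname{Var}(Z_i^2) = 2k.
\end{aligned}
```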
Mode
Mode = max(k - 2, 0) for k ≥ 1. Mode shifts rightward as degrees of freedom increase.
Moment Existence
All positive integer moments exist and can be expressed through gamma functions: E[X^m] = 2^m Γ(m + k/2) / Γ(k/2). For m = 1 this reduces to the mean k.
Probability Density Function
General Formula
The pdf for chi square distribution with k degrees of freedom is given by:
f(x; k) = (1 / (2^(k/2) * Γ(k/2))) * x^(k/2 - 1) * e^(-x/2), x > 0
Gamma Function Role
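The factor 1 / (2^(k/2) * Γ(k/2)) normalizes the pdf so that it integrates to 1. Γ generalizes the factorial to non-integer arguments, with Γ(n) = (n - 1)! for positive integers n; this is needed because k/2 is a half-integer whenever k is odd.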
Shape Behavior
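For k ≤ 2, pdf is largest at zero and strictly decreasing (unbounded near zero when k = 1), heavily skewed. For k > 2, pdf has a single peak at x = k - 2. For large k, pdf approaches normal shape.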
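The density formula translates directly into code. This is a minimal sketch using only the standard library; the function name `chi2_pdf` is an assumption for illustration, and the density is evaluated on the log scale to avoid overflow in Γ(k/2) for large k. Following the stated domain x > 0, the sketch returns 0 for x ≤ 0.

```python
import math

def chi2_pdf(x, k):
    """Density of the chi square distribution with k degrees of freedom at x."""
    if x <= 0:
        return 0.0  # the formula is stated for x > 0 only
    half_k = k / 2.0
    log_pdf = ((half_k - 1.0) * math.log(x)
               - x / 2.0
               - half_k * math.log(2.0)
               - math.lgamma(half_k))   # log of Gamma(k/2)
    return math.exp(log_pdf)

# For k = 4 the density should peak at the mode k - 2 = 2.
for x in (1.5, 2.0, 2.5):
    print(x, round(chi2_pdf(x, 4), 5))
```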
Cumulative Distribution Function
Definition
CDF expresses P(X ≤ x). Computed through lower incomplete gamma function or regularized gamma function.
Formula
F(x; k) = γ(k/2, x/2) / Γ(k/2)
where γ(·,·) is the lower incomplete gamma function.
Numerical Evaluation
The CDF has no elementary closed form for general k; numerical libraries and tables are used for practical computation.
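As an illustration of numerical evaluation, the regularized lower incomplete gamma function can be summed as a power series. The sketch below is one simple approach under stated assumptions, not how any particular library does it: the name `chi2_cdf`, the tolerance, and the term limit are illustrative, and the series is best suited to moderate x.

```python
import math

def chi2_cdf(x, k):
    """P(X <= x) for chi square(k), via the series
    gamma(a, z) = z^a e^(-z) * sum_{n>=0} z^n / (a (a+1) ... (a+n)),
    with a = k/2 and z = x/2, divided by Gamma(a)."""
    if x <= 0:
        return 0.0
    a, z = k / 2.0, x / 2.0
    term = 1.0 / a          # n = 0 term of the series
    total = term
    for n in range(1, 10_000):
        term *= z / (a + n)
        total += term
        if term < 1e-15 * total:   # remaining terms are negligible
            break
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))

print(round(chi2_cdf(3.841, 1), 4))   # close to 0.95, matching standard tables
```

For k = 2 the result can be compared with the closed form 1 - e^(-x/2).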
Moment Generating Function
Definition
MGF of chi square distribution helps derive moments and analyze sums of independent variables.
Expression
M_X(t) = (1 - 2t)^(-k/2), t < 1/2
Applications
Used in variance calculations, central limit theorem approximations, and deriving cumulants.
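As a worked example of moment derivation, differentiating the MGF at t = 0 recovers the mean and variance. The steps below are a sketch using only M_X(t) = (1 - 2t)^(-k/2).

```latex
\begin{aligned}
M_X'(t)  &= k\,(1 - 2t)^{-k/2 - 1},        & M_X'(0)  &= E[X] = k, \\
M_X''(t) &= k(k + 2)\,(1 - 2t)^{-k/2 - 2}, & M_X''(0) &= E[X^2] = k(k + 2), \\
\operatorname{Var}(X) &= E[X^2] - (E[X])^2 = k(k + 2) - k^2 = 2k.
\end{aligned}
```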
Characteristic Function
Definition
Characteristic function (CF) encodes full distribution information; useful in distribution theory and convolutions.
Formula
φ_X(t) = (1 - 2it)^(-k/2), t ∈ ℝ
Properties
CF uniquely determines chi square distribution. Enables proofs of convergence and moment computations.
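One convolution result that follows directly from the CF is additivity. As a sketch, assume X and Y are independent chi square variables with k_1 and k_2 degrees of freedom (symbols introduced here for illustration); the CF of the sum is the product of the CFs:

```latex
\varphi_{X+Y}(t)
  = \varphi_X(t)\,\varphi_Y(t)
  = (1 - 2it)^{-k_1/2}\,(1 - 2it)^{-k_2/2}
  = (1 - 2it)^{-(k_1 + k_2)/2}.
```

Since the CF determines the distribution uniquely, X + Y is chi square with k_1 + k_2 degrees of freedom.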
Relationships with Other Distributions
Gamma Distribution
Chi square is a special case of gamma distribution with shape α = k/2, scale β = 2.
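The special case can be verified by substitution. The sketch below writes the gamma density in its shape-scale form (g is a symbol introduced here for illustration) and sets α = k/2, β = 2:

```latex
g(x; \alpha, \beta) = \frac{x^{\alpha - 1} e^{-x/\beta}}{\Gamma(\alpha)\,\beta^{\alpha}}
\quad\Longrightarrow\quad
g\!\left(x; \tfrac{k}{2}, 2\right) = \frac{x^{k/2 - 1} e^{-x/2}}{2^{k/2}\,\Gamma(k/2)} = f(x; k).
```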
Normal Distribution
Sum of squared standard normal variables leads to chi square. By the central limit theorem, the standardized variable (X - k) / √(2k) converges in distribution to N(0,1) as k increases.
F Distribution
Ratio of two independent chi square variables, each divided by its degrees of freedom, follows an F distribution: (X_1 / k_1) / (X_2 / k_2) ~ F(k_1, k_2).
Exponential Distribution
Chi square with k=2 is equivalent to exponential distribution with mean 2.
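The equivalence can be read off the density. Setting k = 2 in the chi square pdf gives, as a short check:

```latex
f(x; 2) = \frac{1}{2^{1}\,\Gamma(1)}\, x^{0}\, e^{-x/2} = \tfrac{1}{2}\, e^{-x/2}, \qquad x > 0,
```

which is the exponential density with rate 1/2, that is, mean 2.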
Applications
Goodness of Fit Tests
Used to test observed versus expected frequency distributions in categorical data.
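A minimal sketch of the goodness-of-fit computation follows. The die-roll counts are hypothetical data invented for illustration, and the function name `chi_square_gof` is an assumption; the statistic itself is the usual sum of (observed - expected)² / expected over categories.

```python
def chi_square_gof(observed, expected):
    """Pearson goodness-of-fit statistic: sum of (O - E)^2 / E over categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical data: 60 rolls of a die, tested against a fair die.
observed = [8, 12, 9, 11, 6, 14]
expected = [10.0] * 6            # 60 rolls / 6 faces

statistic = chi_square_gof(observed, expected)
df = len(observed) - 1           # no parameters estimated from the data
print(f"statistic = {statistic:.2f}, df = {df}")
```

With 5 degrees of freedom the 5% critical value is 11.070, so a statistic this small would not lead to rejecting fairness.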
Test of Independence
Applied in contingency tables to evaluate independence between categorical variables.
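The independence test can be sketched as follows. The 2x2 table is hypothetical data invented for illustration, and `chi_square_independence` is an assumed helper name; expected counts are row total times column total divided by the grand total, and the degrees of freedom are (rows - 1)(columns - 1).

```python
def chi_square_independence(table):
    """Return (statistic, df) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

# Hypothetical 2x2 table: rows are groups, columns are outcomes.
table = [[30, 20],
         [20, 30]]
statistic, df = chi_square_independence(table)
print(f"statistic = {statistic:.2f}, df = {df}")
```

Here every expected count is 25, and the statistic exceeds the 5% critical value of 3.841 for 1 degree of freedom but not the 1% value of 6.635.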
Variance Estimation
Confidence intervals for population variance of normal distribution derived using chi square distribution.
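The interval comes from inverting a probability statement about the pivot (n - 1)S²/σ². In the sketch below, χ²_{p, n-1} denotes the lower-tail p-quantile of the chi square distribution with n - 1 degrees of freedom; this notation is an assumption introduced here, and the data are assumed to be an independent sample of size n from a normal population.

```latex
\begin{aligned}
P\!\left( \chi^2_{\alpha/2,\, n-1} \le \frac{(n-1) S^2}{\sigma^2} \le \chi^2_{1-\alpha/2,\, n-1} \right) &= 1 - \alpha \\
\Longrightarrow\quad
\frac{(n-1) S^2}{\chi^2_{1-\alpha/2,\, n-1}} \;\le\; \sigma^2 \;\le\; \frac{(n-1) S^2}{\chi^2_{\alpha/2,\, n-1}}
&\quad \text{with confidence } 1 - \alpha.
\end{aligned}
```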
Model Fit and Residual Analysis
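Assess residual sum of squares in regression and model adequacy. In a linear model with p estimated coefficients, n observations, and normal errors of variance σ², RSS/σ² follows a chi square distribution with n - p degrees of freedom.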
Parameter Estimation
Degrees of Freedom Determination
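Usually known from sample size and number of parameters estimated, for example n - 1 for a variance test, (number of categories - 1 - number of estimated parameters) for goodness of fit, and (rows - 1)(columns - 1) for a contingency table. Crucial for correct test statistics.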
Sample Variance-Based Estimation
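For a sample of size n from a normal population with variance σ², the statistic (n-1)S²/σ², where S² is the sample variance, follows a chi square distribution with n - 1 degrees of freedom.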
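A simulation sketch of this statistic is given below. The population parameters, sample size, seed, and replication count are hypothetical choices for illustration; if the statistic is chi square with n - 1 degrees of freedom, its simulated mean and variance should be near n - 1 and 2(n - 1).

```python
import random

def variance_statistic(sample, sigma2):
    """Return (n - 1) * S^2 / sigma^2 for one sample."""
    n = len(sample)
    mean = sum(sample) / n
    s2 = sum((x - mean) ** 2 for x in sample) / (n - 1)  # unbiased sample variance
    return (n - 1) * s2 / sigma2

rng = random.Random(2024)        # fixed seed for repeatability
n, mu, sigma = 8, 5.0, 2.0       # hypothetical normal population and sample size

values = [
    variance_statistic([rng.gauss(mu, sigma) for _ in range(n)], sigma ** 2)
    for _ in range(20_000)
]
mean_stat = sum(values) / len(values)
var_stat = sum((v - mean_stat) ** 2 for v in values) / (len(values) - 1)
print(f"mean = {mean_stat:.2f} (theory {n - 1}), variance = {var_stat:.2f} (theory {2 * (n - 1)})")
```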
Maximum Likelihood Estimation
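In most applications k is fixed by the study design rather than estimated. If k is treated as unknown, the method of moments gives k̂ = sample mean (since mean = k), while the MLE solves a likelihood equation involving the digamma function and generally needs numerical methods; the two estimates need not coincide.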
Hypothesis Testing
Null Hypothesis Framework
Null hypothesis often specifies expected frequencies or variance values tested using chi square statistics.
Test Statistic Construction
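Sum of squared standardized residuals or deviations forms the test statistic, for example ∑ (O_i - E_i)² / E_i for count data. Under H₀ it follows a chi square distribution, exactly or approximately depending on the test.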
Decision Criteria
Compare calculated statistic to critical value from chi square table at chosen significance level α. Reject H₀ if the statistic exceeds the critical value (upper-tail test).
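The decision rule can be sketched as a table lookup. The dictionary below holds a handful of standard upper-tail critical values; the names `CRITICAL_VALUES` and `reject_null` are assumptions for illustration, and a real analysis would cover far more combinations of k and α.

```python
# Upper-tail critical values keyed by (degrees of freedom, alpha).
CRITICAL_VALUES = {
    (1, 0.05): 3.841,   (1, 0.01): 6.635,
    (5, 0.05): 11.070,  (5, 0.01): 15.086,
    (10, 0.05): 18.307, (10, 0.01): 23.209,
}

def reject_null(statistic, df, alpha):
    """Return True if the statistic exceeds the tabulated critical value."""
    try:
        critical = CRITICAL_VALUES[(df, alpha)]
    except KeyError:
        raise ValueError(f"no tabulated value for df={df}, alpha={alpha}") from None
    return statistic > critical

print(reject_null(4.0, 1, 0.05))   # 4.0 > 3.841, so H0 is rejected at the 5% level
print(reject_null(4.0, 1, 0.01))   # 4.0 < 6.635, so H0 is not rejected at the 1% level
```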
Tables and Critical Values
Standard Chi Square Tables
Provide critical values for various α levels and degrees of freedom for hypothesis testing.
Example Critical Values
| Degrees of Freedom (k) | α = 0.05 | α = 0.01 |
|---|---|---|
| 1 | 3.841 | 6.635 |
| 5 | 11.070 | 15.086 |
| 10 | 18.307 | 23.209 |
Interpolation and Software
Intermediate values found by interpolation or software (R, Python, MATLAB) providing exact p-values.
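As a sketch of what such software does, critical values and p-values can be obtained by inverting the CDF numerically. The code below uses only the standard library, with a series for the CDF and bisection for the quantile; the function names are assumptions for illustration. If SciPy is available, `scipy.stats.chi2.ppf` and `scipy.stats.chi2.sf` provide the same quantities.

```python
import math

def chi2_cdf(x, k):
    """P(X <= x) for chi square(k), via a series for the lower incomplete gamma."""
    if x <= 0:
        return 0.0
    a, z = k / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    for n in range(1, 10_000):
        term *= z / (a + n)
        total += term
        if term < 1e-15 * total:
            break
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))

def chi2_p_value(statistic, k):
    """Upper-tail probability P(X > statistic)."""
    return 1.0 - chi2_cdf(statistic, k)

def chi2_critical_value(alpha, k):
    """Value c with P(X > c) = alpha, found by bisection on the CDF."""
    target = 1.0 - alpha
    lo, hi = 0.0, max(1.0, float(k))
    while chi2_cdf(hi, k) < target:   # grow the bracket until it contains c
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, k) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

print(round(chi2_critical_value(0.05, 5), 3))   # compare with the tabulated 11.070
print(round(chi2_p_value(4.0, 1), 4))
```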
Limitations and Assumptions
Independence Assumption
Assumes independent samples or observations. Correlated data invalidate chi square test assumptions.
Sample Size
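Chi square approximations, as used in tests on count data, require sufficiently large sample sizes for validity; a common rule of thumb is an expected count of at least 5 in each cell.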
Normality Requirement
Underlying variables assumed normal or approximated by normality in variance-related tests.
Non-Negativity Constraint
Distribution only applicable for non-negative sums of squares; not suitable for negative-valued statistics.