Definition

Basic Concept

Interquartile Range (IQR): range between first quartile (Q1) and third quartile (Q3). Measures statistical dispersion of middle 50% of data. Represents spread of central data values, excluding extremes.

Quartiles Breakdown

Quartiles: values dividing dataset into four equal parts. Q1: 25th percentile. Median (Q2): 50th percentile. Q3: 75th percentile. IQR = Q3 − Q1.

Robustness

Resistant to outliers and extreme values. Unlike range, IQR focuses on central distribution. Preferred in skewed or non-normal datasets.

Calculation Methods

Step-by-Step Procedure

1. Sort data in ascending order.
2. Find Q1: median of lower half (below median).
3. Find Q3: median of upper half (above median).
4. Compute IQR = Q3 − Q1.
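The procedure above can be sketched in Python. This is a minimal illustration, assuming the median is left out of both halves when the number of values is odd; the function name `iqr_median_of_halves` and the sample values are hypothetical and not taken from the text.

```python
from statistics import median


def iqr_median_of_halves(values):
    """Return (q1, q3, iqr) using the median-of-halves procedure.

    Assumption: for an odd number of values, the median itself is
    left out of both halves.
    """
    data = sorted(values)            # Step 1: sort ascending
    n = len(data)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    lower = data[:half]              # values below the median
    upper = data[n - half:]          # values above the median
    q1 = median(lower)               # Step 2: median of lower half
    q3 = median(upper)               # Step 3: median of upper half
    return q1, q3, q3 - q1           # Step 4: IQR = Q3 - Q1


# Hypothetical sample, not taken from the text.
q1, q3, iqr = iqr_median_of_halves([3, 7, 8, 5, 12, 14, 21])
print(q1, q3, iqr)
```

Slicing by `n // 2` from each end handles even and odd sample sizes with the same code path, which is why no explicit parity check is needed here.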

Median Inclusion Variants

Two approaches: include median in halves if odd number of data points (Tukey), or exclude median (Moore and McCabe). Slight differences in Q1 and Q3 values.
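The effect of the two conventions can be seen on a small odd-sized sample. The sketch below is illustrative only; the helper `quartiles` and its `include_median` flag are hypothetical names introduced for this example.

```python
from statistics import median


def quartiles(values, include_median):
    """Return (q1, q3) as medians of the lower and upper halves.

    include_median only matters when the number of values is odd:
    True puts the median in both halves, False leaves it out of both.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    if n % 2 == 1 and include_median:
        lower, upper = data[:half + 1], data[half:]
    else:
        lower, upper = data[:half], data[n - half:]
    return median(lower), median(upper)


data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(quartiles(data, include_median=True))    # (3, 7)
print(quartiles(data, include_median=False))   # (2.5, 7.5)
```

For an even number of values the flag has no effect, since the median is not a data point and both conventions split the data identically.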

Use of Software

Statistical packages (R, Python, SPSS) provide built-in functions. Methods may vary slightly based on interpolation techniques.
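As one illustration of how built-in functions can disagree, Python's standard library offers `statistics.quantiles` with two interpolation methods, `"exclusive"` and `"inclusive"`. The sketch below applies both to a hypothetical six-value sample; these interpolation rules are not the same as the median-of-halves procedure, so results may differ from hand calculations.

```python
from statistics import quantiles

data = [7, 15, 36, 39, 40, 41]   # hypothetical sample

# n=4 asks for the three cut points Q1, Q2, Q3.
q1_exc, _, q3_exc = quantiles(data, n=4, method="exclusive")
q1_inc, _, q3_inc = quantiles(data, n=4, method="inclusive")

print("exclusive:", q1_exc, q3_exc, q3_exc - q1_exc)
print("inclusive:", q1_inc, q3_inc, q3_inc - q1_inc)
```

When reporting an IQR computed by software, it helps to state which package and method produced it, since small samples are the most sensitive to this choice.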

| Data Set | Sorted | Q1 | Q3 | IQR |
|---|---|---|---|---|
| 7, 15, 36, 39, 40, 41, 42, 43, 47, 49 | 7, 15, 36, 39, 40, 41, 42, 43, 47, 49 | 36 | 43 | 7 |

Properties

Range of Values

IQR value ≥ 0. Zero indicates no spread between Q1 and Q3. Larger values indicate greater dispersion.

Scale and Shift Behavior

Linear transformations affect IQR proportionally. Multiplying data by constant k scales IQR by |k|. Adding a constant shifts Q1 and Q3 equally, leaving IQR unchanged.
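A quick numerical check of how the IQR responds to a linear transformation k·x + c: the factor k rescales it by |k|, while the shift c moves Q1 and Q3 by the same amount and drops out. The sketch uses `statistics.quantiles` with the inclusive method; the helper `iqr`, the sample, and the constants `k` and `c` are arbitrary choices for illustration.

```python
from statistics import quantiles


def iqr(values):
    """IQR from the standard library's inclusive quantile method."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


data = [1, 2, 3, 4, 5, 6, 7, 8, 9]   # hypothetical sample
k, c = -2, 10                         # arbitrary scale and shift

shifted = [x + c for x in data]       # shift only
scaled = [k * x + c for x in data]    # scale and shift

print(iqr(data), iqr(shifted), iqr(scaled))
```

A negative k reverses the order of the data, so the old Q1 becomes the new Q3; the absolute value in |k| accounts for this.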

Robustness to Outliers

Focus on middle 50% excludes influence of extreme values. Makes IQR suitable for skewed distributions.

Interpretation

Measure of Spread

IQR quantifies variability of central data. Reflects typical range where middle half of observations lie.

Skewness Indicator

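Relative positions of Q1 and Q3 to median indicate skewness. Q3 − median > median − Q1 suggests right (positive) skew; Q3 − median < median − Q1 suggests left (negative) skew; roughly equal distances suggest symmetry in the middle 50%.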
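The comparison of the two quartile distances can be turned into a single number. The sketch below computes both distances and their normalized difference, (Q3 + Q1 − 2·Q2) / (Q3 − Q1), commonly known as Bowley's quartile skewness coefficient; that coefficient is not part of the original text and is included here only as an illustration. Quartiles are taken by median of halves, with the median excluded for odd sample sizes.

```python
from statistics import median


def quartile_skew(values):
    """Return (median - Q1, Q3 - median, Bowley coefficient)."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    q1 = median(data[:half])
    q2 = median(data)
    q3 = median(data[n - half:])
    lower_gap = q2 - q1
    upper_gap = q3 - q2
    iqr = q3 - q1
    if iqr == 0:
        raise ValueError("IQR is zero; coefficient is undefined")
    # Positive result: upper gap is wider (right skew); negative: left skew.
    return lower_gap, upper_gap, (upper_gap - lower_gap) / iqr


# Hypothetical right-skewed sample.
print(quartile_skew([1, 2, 2, 3, 4, 6, 9, 15]))
```

The coefficient lies between −1 and 1 and only reflects the shape of the middle 50%, so it can miss skewness that lives entirely in the tails.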

Data Consistency

Smaller IQR implies more consistent data. Larger IQR suggests more variability or heterogeneity.

Applications

Descriptive Statistics

Summarizes spread alongside median. Common in reports, research papers, and exploratory data analysis.

Outlier Detection

Used to define fences for identifying outliers: lower fence = Q1 − 1.5×IQR, upper fence = Q3 + 1.5×IQR.

Boxplot Construction

Determines length of box representing middle 50% data. Visual tool for data distribution and spread.

Comparison with Other Measures

Range vs IQR

Range includes extremes, sensitive to outliers. IQR excludes extremes, more robust.

Standard Deviation vs IQR

Standard deviation uses every observation, so it is sensitive to outliers; most interpretable for roughly normal data. IQR requires no distribution assumptions.

Variance vs IQR

Variance quantifies average squared deviation, influenced by extremes. IQR focuses on central spread.
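The contrast between these measures shows up clearly when a single value is replaced by an extreme one. The following sketch compares range, sample standard deviation, and IQR on two hypothetical samples; the helper names and data are invented for illustration, and the IQR uses the median-of-halves rule.

```python
from statistics import median, stdev


def iqr(values):
    """IQR by median of halves (median excluded for odd n)."""
    data = sorted(values)
    half = len(data) // 2
    return median(data[len(data) - half:]) - median(data[:half])


def spread_summary(values):
    """Collect three spread measures for side-by-side comparison."""
    return {
        "range": max(values) - min(values),
        "stdev": stdev(values),      # sample standard deviation
        "iqr": iqr(values),
    }


clean = [10, 12, 13, 15, 16, 18, 20, 21, 23, 25]   # hypothetical sample
dirty = clean[:-1] + [250]                          # largest value replaced

print(spread_summary(clean))
print(spread_summary(dirty))
```

In this sample the range and standard deviation grow sharply once the extreme value is introduced, while the IQR does not move because Q1 and Q3 are unaffected.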

Advantages and Limitations

Advantages

Robustness to outliers. Simple to compute. Intuitive interpretation. Useful for non-normal data.

Limitations

Ignores data outside middle 50%. Less efficient than standard deviation for symmetric, normal data. Not used in standard parametric tests.

Mitigation

Combine with other statistics (median, mean, standard deviation) for comprehensive analysis.

Role in Outlier Detection

Fence Methodology

Outliers: observations beyond fences defined by IQR. Lower fence = Q1 − 1.5×IQR, upper fence = Q3 + 1.5×IQR.

Extreme Outliers

Defined using 3×IQR fences: observations below Q1 − 3×IQR or above Q3 + 3×IQR. Emphasizes points far beyond typical spread.

Practical Use

Widely applied in data cleaning, quality control, and exploratory data analysis.

| Statistic | Formula | Purpose |
|---|---|---|
| Lower Fence | Q1 − 1.5 × IQR | Detect mild outliers |
| Upper Fence | Q3 + 1.5 × IQR | Detect mild outliers |
| Lower Extreme Fence | Q1 − 3 × IQR | Detect extreme outliers |
| Upper Extreme Fence | Q3 + 3 × IQR | Detect extreme outliers |
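The fence rules in the table can be applied mechanically. The sketch below labels each point as a mild or extreme outlier; the function name `classify_outliers`, the returned dictionary layout, and the sample data are hypothetical, and quartiles are taken by median of halves, so other quartile conventions may shift the fences slightly.

```python
from statistics import median


def classify_outliers(values):
    """Label points as mild or extreme outliers using IQR fences.

    Quartiles use the median-of-halves rule (median excluded for odd n).
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    q1 = median(data[:half])
    q3 = median(data[n - half:])
    iqr = q3 - q1
    inner = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)   # mild fences
    outer = (q1 - 3.0 * iqr, q3 + 3.0 * iqr)   # extreme fences
    mild, extreme = [], []
    for x in data:
        if x < outer[0] or x > outer[1]:
            extreme.append(x)                  # beyond the 3 x IQR fences
        elif x < inner[0] or x > inner[1]:
            mild.append(x)                     # between the two fence pairs
    return {"inner": inner, "outer": outer, "mild": mild, "extreme": extreme}


# Hypothetical sample with one mild and one extreme high value.
result = classify_outliers([2, 4, 4, 5, 6, 7, 8, 9, 18, 40])
print(result)
```

Checking the outer fences first keeps the two categories disjoint: a point counted as extreme is not also reported as mild.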

Examples

Example 1: Simple Data Set

Data: 1, 2, 3, 4, 5, 6, 7, 8, 9

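Sorted: same as data. Median = 5. Including the median in both halves (Tukey): Q1 = 3, Q3 = 7, IQR = 7 − 3 = 4. Excluding the median: Q1 = 2.5, Q3 = 7.5, IQR = 5.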

Example 2: Skewed Data

Data: 5, 7, 8, 12, 15, 18, 22, 27, 30, 45

Q1 = 8, Q3 = 27, IQR = 19 (median-of-halves method). Large IQR reflects spread in the middle 50% of the data, unaffected by the long upper tail.

Example 3: Outlier Detection

Data: 10, 12, 15, 15, 16, 18, 22, 23, 24, 100

Q1 = 15, Q3 = 23, IQR = 8 (median-of-halves method)

Lower fence = 15 − 1.5×8 = 3 (no lower outliers)

Upper fence = 23 + 1.5×8 = 35 (100 > 35 ⇒ 100 is outlier)

Visualization Techniques

Boxplot

Box spans Q1 to Q3 (IQR); median shown inside box; whiskers extend to the most extreme data values within the fences (or to data minimum and maximum in some variants); outliers plotted separately.
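The numbers behind a boxplot can be computed without any plotting library. The sketch below assumes the common convention in which whiskers stop at the most extreme data values that still lie within the 1.5×IQR fences; the function name `boxplot_stats`, its `whisker` parameter, and the sample are hypothetical, and quartiles use the median-of-halves rule.

```python
from statistics import median


def boxplot_stats(values, whisker=1.5):
    """Return the numbers needed to draw a fence-based boxplot."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    q1 = median(data[:half])
    q3 = median(data[n - half:])
    iqr = q3 - q1
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr
    inside = [x for x in data if low_fence <= x <= high_fence]
    return {
        "q1": q1,                     # bottom edge of the box
        "median": median(data),       # line inside the box
        "q3": q3,                     # top edge of the box
        "whisker_low": inside[0],     # smallest value inside the fences
        "whisker_high": inside[-1],   # largest value inside the fences
        "outliers": [x for x in data if x < low_fence or x > high_fence],
    }


# Hypothetical sample.
stats = boxplot_stats([2, 4, 4, 5, 6, 7, 8, 9, 18, 40])
print(stats)
```

Note that the whiskers end at actual data values, not at the fences themselves, so the drawn whisker is usually shorter than 1.5×IQR.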

Quantile Plot

Displays quartiles as points; visually assesses spread and skewness.

Histogram with Quartiles

Overlay quartile lines on histogram to show distribution concentration.

Formulas and Algorithms

IQR Formula

IQR = Q3 − Q1
Where:
Q1 = 25th percentile
Q3 = 75th percentile

Outlier Detection Formula

Lower Fence = Q1 − 1.5 × IQR
Upper Fence = Q3 + 1.5 × IQR
Data point x is an outlier if: x < Lower Fence or x > Upper Fence

Algorithm to Calculate IQR

Input: Data set D with n values
Step 1: Sort D ascending
Step 2: Compute median (Q2)
Step 3: Split D into lower half (below median) and upper half (above median)
Step 4: Compute Q1 as median of lower half
Step 5: Compute Q3 as median of upper half
Step 6: Calculate IQR = Q3 − Q1
Output: IQR value

References

  • Wilcox, R.R., "Introduction to Robust Estimation and Hypothesis Testing", Academic Press, 2012, pp. 45-67.
  • McGill, R., Tukey, J.W., Larsen, W.A., "Variations of Boxplots", The American Statistician, Vol. 32, 1978, pp. 12-16.
  • Hyndman, R.J., Fan, Y., "Sample Quantiles in Statistical Packages", The American Statistician, Vol. 50, 1996, pp. 361-365.
  • Hoaglin, D.C., Iglewicz, B., Tukey, J.W., "Performance of Some Resistant Rules for Outlier Labeling", Journal of the American Statistical Association, Vol. 81, 1986, pp. 991-999.
  • Hyndman, R.J., "Computing and Graphing Highest Density Regions", The American Statistician, Vol. 50, 1996, pp. 120-126.