The Flynn Effect: Why IQ Scores Keep Rising and What It Means

In 1984, James Robert Flynn, a political philosopher at the University of Otago in New Zealand, published a finding that shook the foundations of intelligence research. After analyzing American IQ test data spanning several decades, Flynn demonstrated that raw scores on standardized intelligence tests had been rising steadily, at approximately 3 points per decade. A 1987 follow-up found gains in every one of the 14 industrialized nations he examined. The magnitude was staggering. The average American of 1978, scored against the original 1932 Stanford-Binet norms, would obtain an IQ of around 115. Conversely, the average American from 1932, transported forward and tested against 1978 norms, would score approximately 85.

This was not a minor statistical curiosity. A 15-point shift in a single generation represents a full standard deviation on the IQ scale. Taken literally, it implied that the average person in 1932 would fall at roughly the 16th percentile by modern standards, a claim that seemed absurd on its face but was supported by robust psychometric data across multiple test instruments and populations.
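The arithmetic behind these figures can be checked directly. The short Python sketch below assumes the conventional IQ scale (mean 100, standard deviation 15), normally distributed scores, and the approximate rate of 3 points per decade quoted above. The function names are illustrative and are not drawn from Flynn's papers.

```python
from statistics import NormalDist

IQ_MEAN = 100.0
IQ_SD = 15.0
GAIN_PER_DECADE = 3.0  # approximate rate quoted in the text (an assumption here)


def cumulative_gain(start_year, end_year, rate_per_decade=GAIN_PER_DECADE):
    """IQ points gained between two norming years at a constant rate."""
    return (end_year - start_year) / 10.0 * rate_per_decade


def percentile(iq):
    """Percentile rank of an IQ score under a normal distribution."""
    return NormalDist(IQ_MEAN, IQ_SD).cdf(iq) * 100.0


print(f"Gain 1932 to 1978: {cumulative_gain(1932, 1978):.1f} points")
print(f"IQ 85 percentile:  {percentile(85.0):.1f}")
print(f"IQ 115 percentile: {percentile(115.0):.1f}")
```

At a constant 3 points per decade, the 46 years between the two norming samples imply a gain of about 13.8 points, close to one full standard deviation, and a score of 85 sits at roughly the 16th percentile.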

Flynn himself did not believe people were actually becoming more intelligent in any deep sense. He called it a paradox, and the research community named it after him: the Flynn Effect.


The Discovery: What Flynn Actually Found

Flynn's original 1984 paper in Psychological Bulletin examined data from the United States, and his follow-up 1987 paper extended the analysis to 14 industrialized nations. The pattern was remarkably consistent.

IQ Score Gains by Country (Approximate Points Per Decade):

Country          Period Analyzed   Gain Per Decade   Test Used
Netherlands      1952--1982        7.0               Raven's Progressive Matrices
Belgium          1958--1967        5.7               Military selection tests
Israel           1954--1984        5.5               Military psychometric battery
Norway           1954--1980        3.2               Military conscript tests
United States    1932--1978        3.0               Stanford-Binet, Wechsler
Britain          1938--1979        2.7               Raven's, Mill Hill
Japan            1950--1980        7.7               Japanese WISC
France           1949--1974        3.9               ECNI battery

Two features of these data surprised researchers. First, gains were largest on tests of fluid intelligence (abstract reasoning, pattern recognition, and novel problem solving, as measured by instruments like Raven's Progressive Matrices) rather than on tests of crystallized intelligence (vocabulary, factual knowledge, and verbal comprehension). This was counterintuitive: if better education and more reading were driving the gains, crystallized intelligence should have shown the largest increases. Instead, the opposite was true.

Second, the gains were not concentrated in any particular segment of the IQ distribution. They appeared across the entire range, though some studies found slightly larger gains at the lower end, suggesting that the bottom of the distribution was being lifted more than the top was being stretched.

"The gains are massive, averaging about 15 IQ points in a single generation. Yet we hesitate to call people more intelligent, because the implications seem absurd. Are we really to believe that our grandparents were borderline intellectually disabled?" --- James R. Flynn, What Is Intelligence? (2007)


Proposed Explanations: Why Scores Rose

No single explanation accounts for the Flynn Effect. The leading hypotheses each explain part of the picture but leave residual questions.

Nutrition and Health

The most straightforward explanation points to improved nutrition, particularly during prenatal development and early childhood. Severe malnutrition during critical developmental windows permanently reduces cognitive capacity. As industrialized nations eliminated widespread malnutrition during the 20th century, a population-level cognitive floor was raised.

Supporting evidence:

  • The Flynn Effect is largest in countries that underwent the most dramatic nutritional improvements during the measurement period
  • Iodine supplementation programs in regions with endemic deficiency produced measurable IQ gains of 8--15 points
  • Mean height increased in parallel with IQ gains across the same populations and time periods, suggesting a shared nutritional driver
  • Lead exposure reductions (particularly the elimination of leaded gasoline) removed a known neurotoxin that disproportionately affected lower-SES children

However, nutrition alone cannot explain the full magnitude of the gains. The Netherlands showed gains of 7 points per decade even during periods when nutritional status was already high by historical standards. And the gains on fluid intelligence exceeded those on crystallized intelligence, which is difficult to reconcile with a purely biological mechanism.

Education and Cognitive Stimulation

Average years of schooling increased dramatically during the 20th century. In the United States, the median years of education rose from 8.1 in 1910 to 12.9 by 1990. More school means more exposure to abstract thinking, categorization, and the type of logical reasoning that fluid intelligence tests measure.

"Our ancestors were not unintelligent; they lived in a world that did not require, and therefore did not cultivate, the kind of abstract thinking that modern IQ tests measure. A farmer in 1900 had no reason to think about hypothetical syllogisms." --- James R. Flynn, Are We Getting Smarter? (2012)

Flynn argued that industrialization created a "scientific spectacles" effect: modern education trains people to think in abstract categories rather than concrete, utilitarian terms. When asked "What do dogs and rabbits have in common?", a pre-industrial farmer might answer "You use dogs to hunt rabbits" (a concrete, functional response). A modern test-taker answers "They are both mammals" (an abstract, taxonomic response). Both answers reflect intelligence, but IQ tests reward the latter.

Environmental Complexity

The modern environment is cognitively denser than the environment of 1900. Consider the differences:

  • Visual media: Television, film, video games, and internet content require processing complex visual information, tracking multiple narrative threads, and extracting rules from novel situations
  • Technology interaction: Operating smartphones, navigating software interfaces, and managing digital information require sustained abstract reasoning
  • Workplace demands: The shift from agricultural and manufacturing labor to service and knowledge work requires more abstract cognitive skills
  • Urban environments: Navigating complex urban systems (public transit, bureaucratic institutions, financial products) demands the kind of rule-based reasoning that fluid intelligence tests measure

Steven Johnson argued in Everything Bad Is Good for You (2005) that popular media --- particularly video games and complex television narratives --- has been training fluid intelligence for decades. The average video game requires players to discover rules through experimentation, manage multiple variables simultaneously, and adapt to novel challenges --- precisely the skills tested by Raven's Progressive Matrices.

Test Familiarity and Test-Taking Skills

A less discussed but potentially significant factor: populations have become more familiar with standardized testing itself. People who have taken multiple standardized tests throughout their schooling approach a novel IQ test with strategies, confidence, and format familiarity that their grandparents lacked. This does not mean they are more intelligent; it means they are better at taking tests.

Research by Teasdale and Owen (2005) in Denmark suggested that up to 2--3 points per decade of the Flynn Effect could be attributed to increased test sophistication rather than genuine cognitive change.


The Reverse Flynn Effect: When Scores Start Falling

Beginning in the late 1990s, researchers in Scandinavia, and later elsewhere in Europe, documented something unexpected: IQ scores had stopped rising and, in some populations, had begun to decline.

Documented Score Declines:

Country          Period        Change Per Decade      Source
Norway           1993--2003    -0.38 points           Bratsberg & Rogeberg (2018)
Denmark          1998--2014    -1.5 points            Dutton & Lynn (2013)
Finland          1997--2009    -2.0 points            Dutton et al. (2016)
France           1999--2008    -3.8 points            Dutton & Lynn (2015)
United Kingdom   1980--2008    -2.5 points (fluid)    Shayer & Ginsburg (2009)

These declines, often called the negative Flynn Effect or reverse Flynn Effect, triggered heated debate. Several explanatory frameworks have been proposed:

Dysgenic fertility hypothesis: Richard Lynn and others argued that individuals with lower IQ scores tend to have more children than those with higher scores, producing a gradual population-level decline in genotypic intelligence. This is the most controversial explanation, carrying uncomfortable eugenic overtones that many researchers reject on both empirical and ethical grounds.

Immigration composition effects: Some researchers suggested that immigration from countries with lower average test scores could produce statistical declines without any change in the cognitive abilities of native-born populations. Bratsberg and Rogeberg (2018) tested this in Norway and found the decline occurred within families --- younger siblings scored lower than older siblings on military conscription tests --- ruling out immigration as the primary cause.

"The decline is happening within families, which means it cannot be explained by changes in who is having children. Something environmental has changed." --- Bernt Bratsberg and Ole Rogeberg, Proceedings of the National Academy of Sciences (2018)

Diminishing returns on environmental gains: The most parsimonious explanation may be that the environmental factors driving the original Flynn Effect (nutrition, education, reduced disease burden) have reached a ceiling in wealthy nations. Once malnutrition is eliminated, iodine deficiency corrected, and universal education achieved, further gains require qualitatively different interventions. Meanwhile, new environmental factors --- increased screen time replacing physical play, reduced sleep duration, changes in educational methodology --- may be introducing modest negative pressures.


What the Flynn Effect Reveals About Intelligence Measurement

The Flynn Effect forced the psychometric community to confront a fundamental question: what do IQ tests actually measure?

If intelligence is a fixed biological property, largely heritable and stable across generations, how can it change by 15 points in 50 years? Heritability estimates for IQ in twin studies range from 0.50 to 0.80, implying a strong genetic component. But the Flynn Effect demonstrates that the phenotypic expression of intelligence is highly sensitive to environmental conditions.

This is not a contradiction. Height is also highly heritable (heritability of ~0.80), yet average height increased by approximately 10 centimeters in industrialized nations during the 20th century as nutrition improved. Heritability describes the proportion of variation within a population at a given time that is attributable to genetic differences. It says nothing about whether the population mean can shift due to environmental changes.

Flynn himself drew an analogy to basketball. If you compare basketball skills within a generation, genetic differences in height, speed, and coordination explain much of the variation. But if you compare basketball skills between 1950 and 2020, the massive improvement reflects better training, nutrition, coaching, and competitive infrastructure --- environmental factors operating on a highly heritable trait.
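The distinction between high heritability within a cohort and an environmentally driven shift between cohorts can be illustrated with a toy simulation. The sketch below is not a model of real data. It assumes a score is simply the sum of an independent genetic component and an environmental component, fixes the within-cohort heritability at 0.8, and moves only the environmental mean between two hypothetical cohorts.

```python
import math
import random
from statistics import fmean, pvariance

H2 = 0.8          # assumed share of within-cohort variance due to genetic differences
TOTAL_SD = 15.0   # conventional IQ standard deviation


def simulate_cohort(n, env_mean, seed):
    """Return (genetic values, scores) for one simulated cohort."""
    rng = random.Random(seed)
    sd_g = TOTAL_SD * math.sqrt(H2)
    sd_e = TOTAL_SD * math.sqrt(1.0 - H2)
    genetic = [rng.gauss(0.0, sd_g) for _ in range(n)]
    # Each score adds an environmental draw centered on env_mean.
    scores = [100.0 + g + rng.gauss(env_mean, sd_e) for g in genetic]
    return genetic, scores


def heritability(genetic, scores):
    """Share of score variance accounted for by the genetic component."""
    return pvariance(genetic) / pvariance(scores)


# Same genetic distribution in both cohorts; only the environment differs.
g_old, s_old = simulate_cohort(20_000, env_mean=-15.0, seed=1)
g_new, s_new = simulate_cohort(20_000, env_mean=0.0, seed=2)

print(f"heritability, older cohort: {heritability(g_old, s_old):.2f}")
print(f"heritability, newer cohort: {heritability(g_new, s_new):.2f}")
print(f"mean shift between cohorts: {fmean(s_new) - fmean(s_old):.1f}")
```

Both cohorts show heritability near 0.8, yet their means differ by about 15 points, entirely because of the environmental term. High heritability and a large generational shift are therefore compatible.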

The implications for intelligence testing are profound. IQ tests must be renormed every 15--20 years to maintain the convention that the population mean is 100. A score of 100 on a 1950 test and a score of 100 on a 2020 test do not represent the same absolute level of performance. They represent the same relative position within the contemporary population. This observation has sometimes informed debates about how evolving population-level cognition reshapes scientific communication --- particularly around whether rising baseline cognitive skills have changed how researchers must frame complexity for public audiences.

This relativity has practical consequences. When older test norms are used for clinical or legal decisions --- determining intellectual disability diagnoses, death penalty eligibility, or special education placement --- the Flynn Effect introduces systematic bias. An individual scored against outdated norms will appear more intelligent than they actually are relative to their peers.
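The effect of norming on a reported score can be made concrete with the standard deviation-IQ conversion, which places a raw score relative to the mean and standard deviation of a norming sample. The sketch below uses made-up raw-score norms for a single hypothetical test. The numbers are assumptions chosen only to show the direction of the bias.

```python
def deviation_iq(raw_score, norm_mean, norm_sd):
    """Convert a raw score to a deviation IQ (mean 100, SD 15) for a norming sample."""
    return 100.0 + 15.0 * (raw_score - norm_mean) / norm_sd


# Hypothetical norming samples for the same test, collected decades apart.
# The newer sample answers more items correctly on average.
OLD_NORM_MEAN, OLD_NORM_SD = 40.0, 8.0
NEW_NORM_MEAN, NEW_NORM_SD = 44.0, 8.0

raw = 40.0  # one person's raw score
iq_old_norms = deviation_iq(raw, OLD_NORM_MEAN, OLD_NORM_SD)
iq_new_norms = deviation_iq(raw, NEW_NORM_MEAN, NEW_NORM_SD)

print(f"IQ against old norms: {iq_old_norms:.1f}")
print(f"IQ against new norms: {iq_new_norms:.1f}")
```

The same raw performance maps to 100 under the older norms and 92.5 under the newer ones, which is why scoring against outdated norms overstates a person's standing among their contemporaries.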


The g Factor Debate: Is General Intelligence Actually Increasing?

The most contentious question in Flynn Effect research: are gains occurring on g, the general factor of intelligence that psychometricians consider the core of cognitive ability?

Arguments that Flynn Effect gains are on g:

  • Gains appear on multiple diverse tests, not just a single instrument
  • The pattern of gains (largest on fluid intelligence) aligns with tests that are most g-loaded
  • If gains were just test-taking artifacts, they should appear equally on all subtests, but they don't

Arguments that Flynn Effect gains are NOT on g:

  • Jensen's finding that the magnitude of gains across Wechsler subtests does not correlate with the g-loadings of those subtests
  • The gains show a different pattern than the pattern of group differences (the "Jensen Effect"), suggesting different underlying causes
  • Real-world achievements have not increased commensurately --- if genuine intelligence increased by 15 points, we should see dramatically more innovation, scientific breakthroughs, and creative achievement per capita

Te Nijenhuis and van der Flier (2013) conducted a meta-analysis of 12 studies and concluded that Flynn Effect gains were negatively correlated with g-loadings: the subtests most saturated with g showed the smallest gains. This finding, if robust, suggests the Flynn Effect reflects improvements in specific cognitive skills rather than an increase in general intelligence.
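The core of this kind of analysis is a correlation between two vectors: each subtest's g-loading and its score gain over time. The sketch below shows only that correlation step, using invented subtest values. The numbers are hypothetical and are not taken from the meta-analysis, and the published method also applies corrections, such as for measurement unreliability, that are omitted here.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical subtests: (g-loading, gain in IQ points per decade).
# These values are made up for illustration only.
subtests = {
    "subtest A": (0.80, 1.0),
    "subtest B": (0.70, 2.0),
    "subtest C": (0.60, 2.5),
    "subtest D": (0.50, 4.0),
}

g_loadings = [loading for loading, _ in subtests.values()]
gains = [gain for _, gain in subtests.values()]
r = pearson_r(g_loadings, gains)

print(f"correlation between g-loadings and gains: {r:.2f}")
```

A negative coefficient, as in this invented example, is the pattern the meta-analysis reports: the more g-loaded a subtest, the smaller its gain.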

The debate remains unresolved. It touches on the deepest questions in intelligence research: whether g is a real cognitive entity or a statistical artifact, whether fluid and crystallized intelligence are truly distinct, and whether population-level trends can be meaningfully compared to individual-level variation.


Cross-Species Parallels and Evolutionary Context

The Flynn Effect is sometimes discussed in isolation, as if human cognitive change occurs in a vacuum. But cognitive evolution operates on broader biological principles. Animal intelligence research has documented cases of rapid cognitive adaptation in non-human species under environmental pressure --- urban-dwelling crows developing novel tool use within decades, octopus populations showing measurable problem-solving improvement in enriched environments, and domesticated animals exhibiting reduced cognitive flexibility compared to wild counterparts.

These parallels suggest that the Flynn Effect may partly reflect a species-level response to increasing environmental complexity, not unlike the cognitive adaptations observed in animals exposed to enriched versus impoverished environments. The mechanism is not genetic evolution (too slow) but rather developmental plasticity --- the capacity of the brain to respond to environmental demands during critical periods of growth.


Practical Implications for Testing and Assessment

The Flynn Effect has direct consequences for anyone involved in cognitive assessment, educational placement, or professional certification.

Clinical diagnosis: A person assessed with a test normed 15 years ago will score approximately 4.5 points higher than they would on a currently normed test. For borderline diagnoses (e.g., intellectual disability threshold of IQ 70), this difference is clinically significant. The American Psychological Association recommends using the most recently normed version of any IQ test.

Legal proceedings: In Atkins v. Virginia (2002), the U.S. Supreme Court banned execution of individuals with intellectual disability. The Flynn Effect became legally relevant because defendants assessed with older test norms might score above the IQ 70 threshold despite genuinely qualifying for intellectual disability on current norms.

Educational trends: Rising baseline scores mean that educational standards and certification examination difficulty must escalate to maintain discriminative validity. What constituted an advanced question in 1980 may be average difficulty today, not because the content changed but because the cognitive baseline of the test-taking population shifted upward.

Cross-generational comparison: Any comparison of cognitive test scores across time periods must account for the Flynn Effect. Claims that "children today are smarter/dumber than their parents" are meaningless without normative adjustment.
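The adjustment implied by the clinical and legal cases above is simple arithmetic: subtract the expected norm inflation from the observed score. The sketch below assumes a constant gain of 0.3 points per year (the 3 points per decade cited earlier). The function names and the example score are hypothetical, and this is an illustration of the calculation, not a clinical tool.

```python
GAIN_PER_YEAR = 0.3  # assumed constant rate, from the ~3 points per decade in the text


def norm_inflation(years_since_norming, gain_per_year=GAIN_PER_YEAR):
    """Points by which an aging norm sample is expected to inflate a score."""
    return years_since_norming * gain_per_year


def adjusted_score(observed_iq, years_since_norming, gain_per_year=GAIN_PER_YEAR):
    """Estimate the score the same performance would earn on current norms."""
    return observed_iq - norm_inflation(years_since_norming, gain_per_year)


observed = 73.0   # hypothetical score on a test normed 15 years earlier
years = 15

print(f"expected inflation: {norm_inflation(years):.1f} points")
print(f"adjusted score:     {adjusted_score(observed, years):.1f}")
```

With 15-year-old norms the expected inflation is 4.5 points, so an observed 73 corresponds to roughly 68.5 on current norms, which falls on the other side of a threshold of 70. Because the rate of gain varies by test, country, and period, any real adjustment would need a rate appropriate to the instrument in question.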


Where the Research Stands Now

The Flynn Effect remains one of the most important and least fully explained phenomena in psychology. Several active research programs continue to investigate its mechanisms, trajectory, and implications.

Current consensus points:

  • The effect is real and has been replicated across dozens of countries and test instruments
  • Multiple causes operate simultaneously; no single factor is sufficient
  • The effect appears to have plateaued or reversed in some wealthy nations
  • Gains are largest on fluid intelligence measures, particularly Raven's Progressive Matrices
  • The effect does not straightforwardly map onto the g factor, complicating its interpretation
  • Environmental factors dominate over genetic factors in explaining the trend

Unresolved questions:

  • Will the reverse Flynn Effect spread to developing nations as they reach nutritional and educational ceilings?
  • How much of the original effect was genuine cognitive improvement versus test sophistication?
  • What environmental changes in wealthy nations are driving the reversal?
  • Can targeted interventions (e.g., cognitive training, educational reform) restart the gains?

"The Flynn Effect tells us that intelligence, however we define it, is not fixed --- not at the individual level and certainly not at the population level. It responds to the world we build around us." --- Ulric Neisser, The Rising Curve (1998)

The Flynn Effect does not tell us that humanity is getting smarter in some absolute sense. What it reveals, with considerable force, is that the cognitive abilities measured by IQ tests are far more malleable than the psychometric tradition historically assumed. The environment shapes the mind, and when environments change rapidly --- as they did throughout the 20th century --- minds change with them.


References

  1. Flynn, J. R. (1984). The mean IQ of Americans: Massive gains 1932 to 1978. Psychological Bulletin, 95(1), 29--51. doi:10.1037/0033-2909.95.1.29

  2. Flynn, J. R. (1987). Massive IQ gains in 14 nations: What IQ tests really measure. Psychological Bulletin, 101(2), 171--191. doi:10.1037/0033-2909.101.2.171

  3. Bratsberg, B., & Rogeberg, O. (2018). Flynn effect and its reversal are both environmentally caused. Proceedings of the National Academy of Sciences, 115(26), 6674--6678. doi:10.1073/pnas.1718793115

  4. Te Nijenhuis, J., & van der Flier, H. (2013). Is the Flynn effect on g? A meta-analysis. Intelligence, 41(4), 169--175. doi:10.1016/j.intell.2013.03.001

  5. Cepeda, N. J., Pashler, H., Vul, E., Wixted, J. T., & Rohrer, D. (2006). Distributed practice in verbal recall tasks. Psychological Bulletin, 132(3), 354--380. doi:10.1037/0033-2909.132.3.354

  6. Neisser, U. (Ed.). (1998). The Rising Curve: Long-Term Gains in IQ and Related Measures. American Psychological Association. doi:10.1037/10270-000

  7. Flynn, J. R. (2007). What Is Intelligence? Beyond the Flynn Effect. Cambridge University Press. doi:10.1017/CBO9780511605253

  8. Pietschnig, J., & Voracek, M. (2015). One century of global IQ gains: A formal meta-analysis of the Flynn effect (1909--2013). Perspectives on Psychological Science, 10(3), 282--306. doi:10.1177/1745691615577701