Walk into any sales kickoff, any executive offsite, any half-decent leadership development workshop, and there is a fair chance someone will at some point draw an inverted U on a whiteboard. The horizontal axis is labeled “arousal” or “stress” or “pressure.” The vertical axis is labeled “performance.” The line rises, peaks, and falls. The speaker — a sales VP, a chief people officer, an executive coach, sometimes a McKinsey-trained consultant who really should know better — explains that this is the Yerkes-Dodson Law, and that it proves you need some stress to perform at your best, but not too much. Too little, and you are bored. Too much, and you are overwhelmed. The sweet spot is right at the top of the curve, and the job of management (or coaching, or this $40,000 training program) is to keep your team there.
It is one of the more durable artifacts of pop psychology. The shape is intuitive. The story is satisfying. The implication — that there exists, for each task and each person, a knowable optimal level of arousal — slots neatly into the operational mind. If we just calibrate the pressure right, performance will be maximized. Stretch goals work because they push people up the rising part of the curve. Burnout happens when you push past the peak. The whole human-performance optimization industry, from quota design to gamification to the entire wellness category, leans on some version of this graph.
The Yerkes-Dodson Law has the unusual property, among the cases in this hub, of being neither fully debunked nor fully supported. The 1908 paper that started it really did find something. The “inverted U” shape really does show up in a number of well-controlled experimental contexts. But the broad, universal, applies-to-everything-from-mouse-learning-to-quarterly-sales-targets version of the law that you see on the whiteboard is mostly the work of later psychologists assembling a tidy story out of a much narrower original finding. The single best treatment of how this happened is a 1994 paper in Theory & Psychology by the Norwegian psychologist Karl Halvor Teigen, titled “Yerkes-Dodson: A law for all seasons.” Teigen went back to the original 1908 paper and traced, decade by decade, how the modern “Yerkes-Dodson Law” was constructed by later writers — most of whom never read the source material.
This article is about that construction. What did Yerkes and Dodson actually demonstrate in 1908? How did one specific finding about mouse learning under electric shock become a universal law about human performance under stress? Where does the inverted-U pattern actually replicate, and where does it fall apart? And — for strategists evaluating a vendor pitch, a productivity framework, or an internal initiative built on “optimal pressure” — what is honest to say about the underlying science?
What Yerkes and Dodson Actually Did In 1908
The foundational paper is Robert M. Yerkes and John D. Dodson (1908), “The relation of strength of stimulus to rapidity of habit-formation,” published in the Journal of Comparative Neurology and Psychology, volume 18, issue 5, pages 459 to 482. It is old enough to be in the public domain and is freely available in archived form. Anyone considering invoking the law in a meeting next quarter should spend twenty minutes reading the actual source.
The subjects were forty “dancing mice” — a strain of laboratory mouse, not human knowledge workers. The task was a brightness-discrimination habit. The mouse was placed in a chamber with two compartments, one illuminated and one dark. The correct response was to enter the brighter compartment. The wrong response was to enter the darker one. Mice that chose the wrong compartment received an electric shock through the wire-mesh floor; mice that chose the correct compartment did not. Trials were repeated until the mouse formed a stable habit of choosing the bright compartment — that is, until the rate of correct responses stabilized at the criterion level the experimenters set.
The independent variable was the strength of the electric shock used to punish errors. Yerkes and Dodson tested several stimulus intensities. They were specifically interested in whether weaker or stronger shocks would produce faster habit-formation — that is, whether mice would learn the discrimination in fewer trials at low, medium, or high shock intensity.
The dependent variable was the number of trials to reach the learning criterion. Faster learning meant fewer trials. The headline finding, the one that has propagated for more than a century, is that habit-formation was fastest at intermediate shock intensities. At very weak shock intensities, learning was slow — the punishment was not strong enough to drive a clean discrimination. At very strong shock intensities, learning was also slow — the punishment was disorganizing rather than instructive. The fastest learning occurred in the middle.
That much is the modern textbook summary. The actual paper is somewhat more interesting and somewhat narrower than the textbook treatment.
First, the discrimination task was varied along a difficulty dimension. Yerkes and Dodson did not just test one task at three shock intensities; they tested an easier discrimination (one bright compartment vs. one quite dark one) and a harder discrimination (one compartment only slightly brighter than the other). They found that the optimal shock intensity depended on the difficulty of the discrimination. For the easy task, the optimal shock was higher; for the hard task, the optimal shock was lower. In the easiest condition, in fact, learning kept getting faster up to the strongest shock tested, so the downturn that gives the curve its name appeared only in the harder conditions. This is the kernel of what later writers would call “the Yerkes-Dodson Law” in its two-part form: there is an inverted-U relationship between arousal and performance, and the location of the peak shifts left as task difficulty increases. The harder the task, the lower the optimal arousal.
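The two-part form is easier to see as a toy model than as a sentence. The sketch below is purely illustrative: the quadratic shape, the 0-to-1 scales for arousal and difficulty, and every constant in it are assumptions chosen for readability, not values taken from the 1908 data or from any later study. It shows only what the textbook version claims: a peak, and a peak that moves toward lower arousal as difficulty rises.

```python
def optimal_arousal(difficulty: float) -> float:
    """Hypothetical peak location on an arbitrary 0-1 arousal scale."""
    # Assumed linear shift: harder tasks (difficulty near 1) peak at lower arousal.
    return 0.8 - 0.5 * difficulty


def performance(arousal: float, difficulty: float, width: float = 0.5) -> float:
    """Toy inverted U: 1.0 at the peak, quadratic fall-off, floored at 0."""
    distance = (arousal - optimal_arousal(difficulty)) / width
    return max(0.0, 1.0 - distance ** 2)


if __name__ == "__main__":
    levels = [i / 10 for i in range(11)]  # arousal grid from 0.0 to 1.0
    for label, difficulty in [("easy", 0.0), ("hard", 1.0)]:
        best = max(levels, key=lambda a: performance(a, difficulty))
        print(f"{label} task: best arousal on this grid = {best:.1f}")
```

Nothing in the sketch says where a real peak sits or how wide a real curve is. Those are exactly the quantities the whiteboard version treats as known.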
Second, the language of the paper is much more cautious than the later popularization. Yerkes and Dodson described their findings as suggesting a relationship between stimulus intensity and rate of habit-formation in mice on visual discrimination tasks. They did not propose a universal law. They did not extrapolate to humans. They did not use the words “arousal,” “stress,” or “performance” in the modern sense. The word “arousal” as a general psychological construct didn’t even exist in 1908 in the form that later writers would deploy. The 1908 paper is about electric shock as a punishing stimulus, mouse learning of a specific perceptual discrimination, and a specific operational measure of learning rate.
Third, the data behind the headline are thinner than the textbook makes it sound. The samples are small — fewer than ten mice per condition in some sub-experiments. The effect sizes, as best they can be reconstructed from the paper’s reporting standards, are moderate. The “inverted U” shape that everyone draws is in the data but it is not a clean parabola, and several of the comparisons that drive the interpretation rest on small differences between groups with substantial within-group variation. By the methodological standards of 2026, this would be a modest, suggestive, single-paper finding, not a “law.”
This last point matters. There is a temptation, when invoking the Yerkes-Dodson Law, to imagine that it is grounded in a robust experimental literature analogous to the one underlying the speed of light or Boyle’s gas law. It is not. It is grounded in one 1908 paper of fewer than thirty pages, about forty mice, under one very specific punishment regime, on one very specific perceptual discrimination task. Whatever survives of the “law” today survives because later researchers, working on entirely different problems, found patterns resembling the 1908 results in their own domains. The empirical chain from “shocked mice in a discrimination chamber” to “your sales team’s quarterly performance” is much longer than the whiteboard implies.
Teigen 1994: How “The Law” Was Assembled
The single most important paper for understanding how the modern Yerkes-Dodson Law came to exist is Karl Halvor Teigen’s 1994 review, “Yerkes-Dodson: A law for all seasons,” published in Theory & Psychology, volume 4, issue 4, pages 525 to 547. Teigen, a Norwegian psychologist with a longstanding interest in the history and rhetoric of psychological claims, went back to the 1908 paper and traced the genealogy of citations forward through the twentieth century. What he found is the kind of thing every working psychologist should encounter once early in their career: a “law” that was substantially built by later researchers reinterpreting and re-citing a much narrower original.
Teigen documents several distinct moves in the construction of the modern law.
The first move, made gradually across mid-twentieth-century reviews, was the generalization from “punishment intensity” to “drive” and eventually to “arousal.” The 1908 paper described a stimulus that punished incorrect responses. By the 1950s, in the heyday of Hull-Spence drive theory, this had been reinterpreted as a study of how drive level affected learning. By the 1960s, with the rise of activation theory and the concept of generalized cortical arousal, “drive” had been reinterpreted as “arousal.” Each step was reasonable in isolation. The cumulative effect was that a study about electric shock had become a study about a much broader and fuzzier psychological construct — and a construct that mapped onto the human experience of stress, anxiety, and excitement in ways the original study did not.
The second move was the inflation of the curve’s universality. The 1908 finding was about a specific perceptual discrimination task in mice. Later writers extracted the shape — inverted U, with optimal peak shifting with difficulty — and proposed that this shape held generally across tasks, species, and contexts. Teigen documents how, by the 1980s, the inverted U was being routinely applied to chess, athletic performance, public speaking, exam performance, surgical decision-making, and creative work — domains that share essentially nothing with brightness discrimination in shocked mice. The inference was that there is something deeply general about the relationship between arousal and performance, and that the 1908 study had detected an early instance of this general law. The original authors did not make this claim.
The third move was the codification of “the law” in textbook form. Once a finding has been reframed as a named law and printed in a textbook, the citation chain calcifies. Successive textbook authors cite the previous textbook, not the original 1908 paper. Teigen identified citation chains in which, for several generations of secondary sources, the authors writing about the 1908 paper had evidently not read it. The “Yerkes-Dodson Law” took on the canonical form (inverted U, peak shifts with difficulty, applies universally) that you now see on whiteboards. The fact that this form was substantially abstracted from the source rather than directly derived from it became invisible.
The fourth move was the emergence of applied uses that the original paper could not possibly have grounded. By the late twentieth century the Yerkes-Dodson Law was being invoked in sports psychology, in test anxiety research, in workplace stress models, in military training design, and in management literature on quota structure and performance pressure. In each case the rhetorical pattern was similar: cite the law, draw the curve, locate the relevant practical question on the curve, derive a practical recommendation. The chain from 1908 mouse data to “your quarterly sales targets” was simply assumed to be sound. Teigen’s contribution was to point out that the chain had been built one step at a time, with each step plausible, but the cumulative inferential distance was enormous and unjustified.
Teigen’s most pointed observation is that “the Yerkes-Dodson Law” as it now exists is less a single empirical finding than a particular shape — the inverted U — that has been used to organize a wide variety of subsequent results. When later researchers found an inverted-U pattern in some new domain (test anxiety and exam scores, say, or competitive arousal and athletic performance) they cited Yerkes-Dodson as the foundational antecedent. When they found a different pattern, they tended either to ignore the law or to invoke “moderating variables” that left the underlying curve formally intact. The result is a “law” that survives in part because its scope has been quietly defined to include only the cases that support it. Teigen’s title — “a law for all seasons” — is sardonic. The law has been applied to so many things, with so many qualifications, that it has become unfalsifiable in practice.
This is not the kind of debunking that cleanly cancels the underlying finding. Teigen does not argue that the inverted U is wrong. He argues that the construction of the law is much more contingent and much more rhetorical than its textbook treatment suggests, and that the strong universal version of the law, the one used to justify “optimal stress” interventions, is not what the original 1908 paper, or the careful subsequent literature, actually supports.
Where The Inverted U Actually Shows Up
The empirical question worth distinguishing from the historical question is: when researchers look carefully, in well-controlled settings, do they find inverted-U relationships between arousal and performance?
The answer is “sometimes, in some contexts, with substantial qualifications.” There are several domains where the pattern is reasonably well-documented.
Test anxiety and exam performance. A meaningful body of research, starting with Sarason and colleagues in the 1950s and 1960s and continuing through more recent work, has found a roughly inverted-U relationship between state anxiety and performance on cognitively demanding tasks like exams. Too little arousal, and motivation and attention are insufficient; too much, and working memory is consumed by anxiety processes that crowd out task performance. This is one of the cleaner real-world domains where the pattern shows up.
Athletic performance. Some sports psychology research has documented inverted-U or related patterns between competitive arousal and athletic performance, particularly in skilled fine-motor sports (gymnastics, golf, archery) where excess arousal disrupts precision. The literature is messier than the test-anxiety literature, and individual differences are large — some athletes perform best under high arousal, others under moderate, others under low — but in aggregate there is a real pattern.
Memory consolidation under emotional stress. A 2007 paper by David Diamond and colleagues in Neural Plasticity, “The temporal dynamics model of emotional memory processing,” reviewed the substantial literature on how acute stress affects memory consolidation in animals and humans. They documented an inverted-U relationship between stress hormone levels and several types of memory performance, especially memory consolidation in the hippocampus. This is one of the more neurobiologically grounded versions of the inverted U, with measurable hormonal correlates rather than ratings on a self-report arousal scale.
Some perceptual and motor learning tasks. In animal learning studies that more closely parallel the original Yerkes-Dodson design, intermediate aversive stimulation often produces faster learning than either very weak or very strong stimulation. The pattern is not universal across species or tasks, but in the original kind of paradigm — punishment-based perceptual discrimination — the basic shape is reasonably reproducible.
But the cases where the pattern doesn’t cleanly appear are equally important to know about.
Complex cognitive performance in real workplaces. When researchers try to find inverted-U relationships between workplace stressors and job performance, the results are much messier. There is plenty of evidence that very high chronic stress degrades performance, that burnout reduces output, and that some baseline level of engagement and challenge improves performance. There is much less evidence that the relationship traces a clean inverted U, that there is an identifiable “optimal stress level” for a given role, or that managers can usefully calibrate pressure to a knowable peak.
Creative and exploratory work. Inverted-U models do not fit creative output well. Creative performance shows strong sensitivity to mood states, autonomy, and time pressure, but the relationships are not well-described by a single curve in arousal space. Several lines of research suggest that low to moderate arousal supports divergent thinking better than high arousal, with no clean peak in the middle.
Habituation and chronic stress. The original Yerkes-Dodson framework concerns acute stimulation effects. Most workplace stress is chronic, and chronic stress effects are not well-modeled by a static inverted U. The neuroendocrine, immunological, and cognitive consequences of long-running stress accumulate in ways that have no peak — they degrade performance progressively with little evidence of an early-stage optimum.
Individual variation. Even in the domains where an inverted U appears in aggregate, individual differences in optimal arousal are very large. Some people perform best under what for others would be debilitating pressure. Some perform best under what for others would be sedating calm. The “right” point on the curve, if there even is a single one, varies from person to person by amounts comparable to the width of any one person’s curve. A pressure level that sits at one person’s peak can sit well down the slope for a colleague. This makes applied prescriptions of the form “operate at moderate arousal for best performance” much weaker than they sound.
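A small simulation makes the individual-variation point concrete. Everything here is hypothetical: the five personal optima, the curve width, and the quadratic shape are made-up values on an arbitrary 0-to-1 scale, chosen only to show the arithmetic. The sketch finds the single arousal setting that maximizes average performance for the group, then reports how each person fares at that shared setting.

```python
from statistics import mean

# Hypothetical personal optima on an arbitrary 0-1 arousal scale (made-up values).
PERSONAL_OPTIMA = [0.2, 0.35, 0.5, 0.65, 0.8]
WIDTH = 0.4  # assumed width of each person's curve


def individual_performance(arousal: float, optimum: float) -> float:
    """Toy inverted U for one person: 1.0 at their optimum, floored at 0."""
    return max(0.0, 1.0 - ((arousal - optimum) / WIDTH) ** 2)


def group_average(arousal: float) -> float:
    """Mean performance when everyone is held at the same arousal level."""
    return mean(individual_performance(arousal, o) for o in PERSONAL_OPTIMA)


if __name__ == "__main__":
    grid = [i / 20 for i in range(21)]  # candidate shared settings, 0.00 to 1.00
    best_shared = max(grid, key=group_average)
    print(f"best single setting for the group: {best_shared:.2f}")
    for optimum in PERSONAL_OPTIMA:
        score = individual_performance(best_shared, optimum)
        print(f"optimum {optimum:.2f} -> performance {score:.2f} at shared setting")
```

Under these assumed numbers the group average does trace an inverted U, yet the people at either end of the range land at less than half of their own best when held at the group's optimum. An aggregate curve can be real while still being a poor guide for any one individual.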
The summary is that the inverted U is a real pattern in some well-defined experimental contexts, not a universal law, and the more clearly the supposed domain departs from the original 1908 paradigm — toward chronic stress, complex cognition, real-world performance, individual variation — the less the inverted U fits.
A useful supplementary paper here is Yaniv Hanoch and Oliver Vitouch’s 2004 review in Theory & Psychology, “When less is more: Information, emotional arousal and the ecological reframing of the Yerkes-Dodson law,” volume 14, issue 4, pages 427 to 452. Hanoch and Vitouch argue that the inverted-U pattern survives best when reframed in ecological-rationality terms — as a relationship between information processing capacity and the demands of a task — rather than as a literal arousal-performance curve. Their treatment is one of the better attempts in the recent literature to take what is durable from the Yerkes-Dodson tradition and place it on firmer theoretical ground.
What Gets Built On Top — Productivity, Sports, Test Prep
The applied uses of the Yerkes-Dodson Law extend in several directions, with varying degrees of empirical support.
Workplace “optimal stress” frameworks. A substantial coaching and consulting category — sometimes branded as “performance management,” sometimes as “stress for success,” sometimes as “challenge zone” thinking — claims to operationalize the Yerkes-Dodson curve by calibrating workplace pressure to a worker’s optimal arousal point. The pattern is to assess where on the curve a person currently sits and adjust workload, deadline pressure, or stretch-goal aggressiveness to move them toward the peak. The intuition is reasonable. The evidence that this can be done reliably, in real organizational settings, at scale, is essentially nonexistent. Most of these frameworks rest on a chain of inference from 1908 mouse data through “general arousal” theory through “optimal challenge” rhetoric, with no specific empirical grounding for the intervention as actually delivered.
Sports “performance zone” coaching. The Yerkes-Dodson framework, often combined with Csikszentmihalyi’s flow construct, underwrites a large portion of sports psychology coaching aimed at helping athletes reach the “zone.” The underlying research on arousal and athletic performance does support some pattern in this direction, particularly for fine-motor sports, but the prescriptive precision of the coaching (“you need to be at arousal level X for your event”) substantially overruns the evidence. The genuine clinical contribution of sports psychology to athletes — anxiety management, mental preparation, pre-performance routines — does not require the strong version of the law to be true.
Test-prep “optimal anxiety” interventions. Test anxiety is one of the cleaner domains for the inverted-U pattern, and there is real empirical grounding for the claim that some moderate level of activation is more useful for exam performance than either flatness or panic. Test-prep programs that include arousal-management components (breathing, cognitive reframing of anxiety, pre-test rituals) are working in a domain where the underlying research provides reasonable support. The framing that there is a precise optimal level to engineer toward is, again, overconfident relative to the evidence — but the broad direction of “manage anxiety to avoid the high-arousal degradation region” is empirically defensible.
Productivity literature on “deep work” and “stress for performance.” A genre of productivity writing aimed at knowledge workers invokes the Yerkes-Dodson curve to argue for engineered challenge, time pressure, or stretch goals as performance enhancers. The chain of inference here is even longer than in the workplace optimal-stress case, and the empirical grounding even weaker. The actual evidence on knowledge-worker productivity and time pressure is mixed and dependent on task type, individual, and chronic context. Citing Yerkes-Dodson in this domain is mostly decorative.
Burnout and the “right side of the curve.” The descending right-hand side of the inverted U has been invoked as a model for burnout: pushed past the optimum, performance falls. This framing has some intuitive appeal but does not map cleanly onto the actual neurobiology of burnout, which involves chronic stress effects, recovery deficits, and motivational and affective changes that are not well-described by a single arousal-performance curve. Treating burnout as “you went past the peak” understates how much of burnout is about sustained mismatch between demands and recovery rather than about acute over-arousal.
The general pattern across these applied uses is that the broad qualitative intuition — that both too little and too much pressure can degrade performance — is reasonable and survives. The specific quantitative claims about optimal arousal levels, the precision with which curves can be calibrated for individuals, and the universal applicability across task types and chronic versus acute contexts are not well-supported and should be treated with skepticism.
What Is Honest To Say About Yerkes-Dodson
Strip away the popularization and a careful summary of the empirical record looks something like this.
In 1908, Yerkes and Dodson published a single study showing that mouse learning of a brightness discrimination task was fastest at intermediate electric-shock intensities, with the optimal intensity shifting lower for harder discriminations. The study was modest, the sample was small, and the language was cautious. The “Yerkes-Dodson Law” as a universal claim about arousal and performance was constructed over the subsequent century by later researchers reinterpreting and generalizing the original finding. Karl Halvor Teigen’s 1994 review traces this construction in detail.
The inverted-U pattern is real in some well-controlled experimental contexts — test anxiety and exam performance, memory consolidation under acute stress, some skilled motor tasks, some perceptual learning paradigms. It is not universal. The further the supposed application departs from the original kind of paradigm (toward chronic stress, complex cognition, real-world workplace performance, large individual variation), the worse the fit.
The broad qualitative intuition that both too little and too much pressure can degrade performance is defensible. The specific quantitative claim that there is a knowable optimal arousal level for each task and individual, which can be calibrated through training or workplace design, is not well-supported. The “performance zone” or “optimal stress” interventions built on the strong version of the law substantially exceed what the underlying evidence justifies.
The law survives in textbooks and on whiteboards in part because its shape is intuitive and its scope has been quietly defined to include only the cases that support it. In careful empirical work, the pattern is one useful heuristic among several, not a universal regulator of performance.
That is a more nuanced summary than “Yerkes-Dodson proves you need optimal stress.” It is also closer to true.
What This Means For Strategists Evaluating “Optimal Stress” Or “Performance Zone” Pitches
For CEOs, COOs, chiefs of staff, and HR leaders evaluating a vendor pitch, consultant framework, or internal initiative that invokes the Yerkes-Dodson Law, the recalibrated picture has several practical implications.
Citing “Yerkes-Dodson” is a soft yellow flag, not because the underlying finding is fake but because the people who lean on the name most heavily are usually working from the popular version rather than the source. A program designer grounded in the careful literature will not lead with “the Yerkes-Dodson Law proves.” They will talk about anxiety management, recovery design, task-difficulty matching, or whatever the actual mechanism is in their domain, with the historical citation as decoration rather than load-bearing structure. Vendors who treat the law as proof of a specific intervention should be asked which empirical literature, in their specific domain, supports the specific intervention they are selling. The answer is rarely as clean as the pitch suggests.
Promises of a calibratable “optimal stress level” for individuals or teams are overconfident. Individual variation in optimal arousal is large enough that the prescriptive precision of “your team should operate at moderate-to-high arousal” is not operationally meaningful. Programs that promise to identify and maintain a worker’s specific optimum, especially through any kind of self-report or wearable monitoring, are extrapolating well beyond the evidence. The narrower claim — “managing acute test anxiety helps test performance,” “fine-motor athletic skill degrades under very high competitive arousal” — is defensible. The broader claim — “we can keep your knowledge workers at their performance peak” — is not.
The chronic-versus-acute distinction matters enormously. The Yerkes-Dodson tradition is fundamentally about acute arousal effects on immediate task performance. Most real workplace performance is shaped by chronic stress and recovery patterns, not acute arousal. Programs that invoke the inverted U to justify ongoing high-pressure environments (“we keep our people on the productive side of the peak”) are misapplying an acute-effect framework to a chronic-effect domain. The actual literature on chronic workplace stress points to recovery, sleep, autonomy, and demand-control balance, not to “calibrate the pressure to optimum.”
The “stretch goal” justification often runs through the law without grounding. A common move in operating-cadence design is to invoke the Yerkes-Dodson curve as evidence that stretch goals (or aggressive quotas, or compressed deadlines) improve performance. The actual evidence on stretch goals is mixed and dependent on context, individual, and recovery design. Citing Yerkes-Dodson in this argument is mostly rhetorical — the underlying study does not bear on the specific question of whether quarterly quota structure improves enterprise sales performance, and treating it as if it does is a form of pseudo-empirical justification for a managerial preference.
Burnout cannot be solved by “moving back to the optimum.” Framing burnout as “you went past the peak of the curve” suggests that the fix is to dial pressure back to the previous optimum. This understates the cumulative, recovery-dependent, neurobiologically distinct character of burnout. Programs that treat burnout as a calibration problem on a single performance curve will mostly miss the underlying issue. The actual evidence-based interventions for burnout look more like restored autonomy, demand reduction, recovery time, and (in serious cases) clinical treatment, not “tune your team’s arousal back to optimum.”
Use the qualitative intuition; do not import the quantitative claims. The defensible takeaway from the Yerkes-Dodson tradition is qualitative: both flatness and panic tend to degrade performance, anxiety management matters, very high acute arousal often disrupts fine cognition and fine motor control. Programs that operationalize that qualitative intuition with reasonable, evidence-grounded interventions (cognitive reframing of pre-performance anxiety, recovery design, demand-control balance, expertise-appropriate task difficulty) are doing useful work. Programs that operationalize it with strong quantitative claims about specific optimal arousal levels are overpromising.
The deeper move, behind all these specific checks: the Yerkes-Dodson Law is best treated as a useful heuristic frame, not as a calibratable scientific law. It tells you that the relationship between arousal and performance is not monotonic, that increasing pressure indefinitely does not improve outcomes, and that some attention to managing arousal is sensible. It does not tell you where the peak is, that it is the same across people, that it is stable over time, that it applies to chronic conditions, or that you can engineer your team to live on it. Vendors who treat it as a calibratable law are working from the textbook version rather than the source.
Sources
Primary research and historical reconstruction:
- Yerkes, R. M., & Dodson, J. D. (1908). The relation of strength of stimulus to rapidity of habit-formation. Journal of Comparative Neurology and Psychology, 18(5), 459–482. https://doi.org/10.1002/cne.920180503
- Teigen, K. H. (1994). Yerkes-Dodson: A law for all seasons. Theory & Psychology, 4(4), 525–547. https://doi.org/10.1177/0959354394044004
- Hanoch, Y., & Vitouch, O. (2004). When less is more: Information, emotional arousal and the ecological reframing of the Yerkes-Dodson law. Theory & Psychology, 14(4), 427–452. https://doi.org/10.1177/0959354304044918
- Diamond, D. M., Campbell, A. M., Park, C. R., Halonen, J., & Zoladz, P. R. (2007). The temporal dynamics model of emotional memory processing: A synthesis on the neurobiological basis of stress-induced amnesia, flashbulb and traumatic memories, and the Yerkes-Dodson Law. Neural Plasticity, 2007, 60803. https://doi.org/10.1155/2007/60803
Secondary and contextual sources:
- Brown, W. P. (1965). The Yerkes-Dodson Law repealed. Psychological Reports, 17, 663–666. An early critical commentary documenting that the universalizing claim was contested from within psychology long before the modern revisions.
- Anderson, K. J., Revelle, W., & Lynch, M. J. (1989). Caffeine, impulsivity, and memory scanning: A comparison of two explanations for the Yerkes-Dodson effect. Motivation and Emotion, 13(1), 1–20. One of the better attempts to dissect the mechanism behind apparent inverted-U effects in human cognitive performance.
Related
- Replication Crisis Hub — the full set of recalibrations on widely-cited psychology findings
- Ego Depletion: From “Willpower Is A Limited Resource” To A Cautionary Tale — another high-confidence applied claim that has been substantially revised
- Csikszentmihalyi’s Flow: What Survives Of The “Optimal Experience” Construct — the construct most often paired with Yerkes-Dodson in “performance zone” coaching
- Fredrickson’s 3:1 Positivity Ratio: How A Single Number Made It Into Leadership Training — another case of a precise-sounding number outrunning its empirical base
- The Marshmallow Test — what happens when the original effect shrinks substantially on closer examination
- Mehrabian’s 7-38-55 Rule — another case where a narrow original study was generalized into a universal “law” the original researcher disowned
FAQ
Is the Yerkes-Dodson Law just wrong?
No. The 1908 study really did find that, on its harder discrimination tasks, intermediate stimulus intensities produced faster mouse learning than very low or very high intensities, and the inverted-U pattern has been observed in a number of subsequent contexts (test anxiety, memory consolidation under stress, some skilled motor tasks). The critique is not that the law is fabricated. The critique is that the broad universal version (applying to all tasks, all people, all chronic and acute conditions, with an identifiable optimal arousal level for each) is much stronger than what the original study or the careful subsequent literature actually supports. The honest summary is “real pattern in some contexts, oversold as a universal law.”
Should I trust a coach or consultant who invokes the Yerkes-Dodson Law?
It depends on what they are doing with it. If they use it as a qualitative frame — “both flatness and panic tend to degrade performance, so let’s design recovery and challenge sensibly” — they are using the science reasonably. If they use it as the empirical basis for a specific intervention — “we will identify your optimal arousal level and engineer your environment to keep you there” — they are extrapolating well beyond the evidence. The test is whether they treat the law as a heuristic frame or as a calibratable scientific instrument. The first is defensible; the second is not.
What about the inverted U in sports performance? Does that actually work?
There is real evidence that acute competitive arousal interacts with athletic performance in ways that resemble an inverted U, particularly in skilled fine-motor sports. The aggregate pattern is real. What is not well-supported is the precision with which individual athletes can be coached to an “optimal arousal level.” Individual variation is very large, and the prescriptive specificity of “you should be at arousal level X for your event” outruns the evidence. Sports psychology interventions on pre-performance routines, anxiety management, and mental preparation have genuine practical value that does not depend on the strong version of the law.
Does this mean stretch goals don’t work?
The Yerkes-Dodson Law is not a useful empirical basis for arguments about stretch-goal design either way. The actual evidence on stretch goals is mixed: it depends heavily on context, individual disposition, and recovery infrastructure, and it does not trace a clean inverted U. Citing Yerkes-Dodson in defense of (or against) stretch goals is mostly rhetorical. The honest treatment is to evaluate the specific stretch-goal literature in your specific domain, not to derive an answer from a 1908 mouse-learning study.
What about flow state? Is that related to Yerkes-Dodson?
Csikszentmihalyi’s flow construct and the Yerkes-Dodson framework are often paired in performance coaching, but they are quite different in their empirical bases. Flow is a phenomenology of subjective experience under conditions of skill-challenge match; Yerkes-Dodson is (originally) about objective performance as a function of arousal intensity. The two have been combined into a “performance zone” narrative in popular usage, but neither construct alone, and certainly not the combination, supports the strong prescriptive claims that “performance zone” coaching often makes. Flow has its own replication and definitional issues, addressed in the separate hub entry.
Is Teigen’s 1994 paper still the best historical treatment?
It is still the most cited and arguably the most pointed. Hanoch and Vitouch’s 2004 paper provides a useful follow-up that proposes an ecological-rationality reframing of what survives of the law. The Diamond et al. 2007 paper in Neural Plasticity is the most substantial recent neurobiological treatment of inverted-U patterns in stress-and-memory contexts. Together these three papers give a reasonably complete picture: Teigen for the historical critique, Hanoch and Vitouch for the careful theoretical reframing, Diamond et al. for what the modern neuroscience actually supports.
How does this apply to test anxiety, where there’s supposedly a real inverted U?
Test anxiety is one of the cleaner domains for an inverted-U pattern, and the broad finding — that some activation supports performance but very high anxiety degrades it — is reasonably well-supported. The practical implication is that managing pre-test anxiety (through cognitive reframing, breathing techniques, preparation rituals) is empirically defensible. What is not supported is the stronger claim that there is a precise optimal anxiety level to engineer toward. The useful version is “manage high anxiety to avoid the high-arousal degradation region.” The overreaching version is “calibrate to your personal peak.”
Should organizations care about the Yerkes-Dodson Law at all?
The qualitative intuition — that both flatness and panic degrade performance, that demands and recovery need to be designed together, that anxiety management matters — is worth taking seriously, and Yerkes-Dodson is a reasonable historical reference point for it. What is not worth taking seriously is the strong prescriptive use of the law to justify specific interventions, specific quota designs, specific stress-engineering programs, or specific claims about “optimal arousal levels.” Organizations should care about the underlying questions (demand-control balance, recovery, anxiety management, task-difficulty matching) and treat the Yerkes-Dodson curve as one rhetorical frame among several, not as scientific authority for any particular practice.