After Miller and Sanjurjo (2018) reversed the hot hand fallacy, some readers asked whether the gambler’s fallacy --- its apparent inverse --- was also a methodological artifact. It is not. Across half a century of lab work, 18 hours of casino surveillance video, decades of lottery purchase records, and the sequential decisions of asylum judges, loan officers, and Major League Baseball umpires, the gambler’s fallacy holds up. The mechanism is well understood. The mitigation strategies work. And the size of the distortion it introduces into real-world sequential decisions is large enough that any organization with a hiring panel, a deal-approval committee, or a promotion process should treat it as a calibration problem worth solving.
If you have been reading through this hub, you have watched a long parade of canonical behavioral-science findings get dismantled. The hot hand fallacy --- the textbook example of “people see patterns where none exist” --- was itself reversed in 2018, when a subtle bias was found in the standard methodology. The natural next question, and one I have been asked directly by readers, is whether the gambler’s fallacy is on similarly thin ice. After all, the two effects are often presented as a matched pair: the hot hand says streaks continue, the gambler’s fallacy says streaks reverse, and both are framed as failures of subjective randomness. If one falls, why not the other?
The answer is that the two phenomena are not symmetric. The hot hand literature was about whether observed sequences in finite shooter data contained more clustering than chance would produce --- a statistical question about data that turned out to have a methodological bug in its standard test. The gambler’s fallacy literature is about whether people expect future outcomes to compensate for past ones --- a behavioral question about choices and bets that can be tested in a dozen ways, in lab and field, across populations and stakes. The empirical record on the gambler’s fallacy is one of the most robust in all of cognitive psychology, and it has only gotten stronger with the addition of high-stakes field data in the 2010s.
This is the second anti-example article in a hub full of takedowns. Like the defaults article, it exists for calibration. Readers should leave the hub knowing that some cognitive biases are robust, well-measured, and consequential --- and the gambler’s fallacy is the cleanest example of one. For strategists evaluating sequential decision processes inside their own organizations, the implication is direct: the same bias that makes roulette players bet against the streak makes hiring panels reject the fifth strong candidate after they have approved four in a row, and the same statistical mistake distorts loan approvals, deal screens, and promotion votes. The fix is not awareness; the fix is structural.
What The Gambler’s Fallacy Actually Is
The gambler’s fallacy is the belief that, in a sequence of independent random outcomes, past outcomes affect the probability of future outcomes in a compensatory direction. After a roulette wheel comes up red five times in a row, the gambler’s-fallacy bettor believes black is “due.” After a coin lands heads four times, the gambler’s-fallacy bettor expects tails on the fifth flip with probability higher than 0.5. The mathematical reality is otherwise: the wheel has no memory, the coin has no memory, and the probability of each independent trial is exactly what it has always been. The previous outcomes are informationally irrelevant.
The fallacy is distinct from a related-but-correct intuition about non-independent processes. If you draw cards from a deck without replacement, then yes, drawing four red cards in a row does shift the probability of the next card being red downward, because the population has actually changed. The gambler’s fallacy is specifically the misapplication of that intuition to processes that are independent --- roulette wheels, dice, coins, slot machines, and (as we will see) the underlying merit distribution of cases coming before a judge or applicants coming before a hiring panel.
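The contrast between the two kinds of process can be made concrete with exact fractions. This is a minimal sketch of my own, assuming a standard 52-card deck with 26 red cards and a fair coin; the function names are illustrative and do not come from any of the cited papers.

```python
from fractions import Fraction

def p_next_red_without_replacement(reds_drawn, blacks_drawn, reds=26, blacks=26):
    """Chance the next card is red once the drawn cards have left the deck."""
    reds_left = reds - reds_drawn
    cards_left = (reds + blacks) - (reds_drawn + blacks_drawn)
    return Fraction(reds_left, cards_left)

def p_next_heads_fair_coin(previous_flips):
    """A fair coin is independent, so the history argument is deliberately ignored."""
    return Fraction(1, 2)

print(p_next_red_without_replacement(0, 0))  # 1/2 before any draws
print(p_next_red_without_replacement(4, 0))  # 11/24 after four reds in a row
print(p_next_heads_fair_coin("HHHH"))        # 1/2 regardless of history
```

The deck's probability moves because the population changed; the coin's does not because nothing about the coin changed. The fallacy consists of treating the second case like the first.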
The error is sometimes framed as “expecting small samples to look like the long-run distribution.” If a fair coin’s long-run probability of heads is 0.5, the gambler’s-fallacy reasoner expects any short run --- ten flips, twenty flips --- to roughly approximate that 0.5. So if the first six flips of a ten-flip sequence have been heads, the fallacy reasoner expects the remaining four flips to lean tails to “balance” the sequence. The mathematical reality is that the remaining four flips will average roughly 0.5 heads regardless of what came before, and the full ten-flip sample will simply be six-plus-two equals eight heads in expectation, not five heads.
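The arithmetic in that last sentence can be checked by brute force. The sketch below is my own illustration, with a hypothetical function name: it enumerates every equally likely continuation of the remaining flips of a fair coin and averages the final head count.

```python
from fractions import Fraction
from itertools import product

def expected_total_heads(heads_so_far, flips_remaining):
    """Average final head count over every equally likely continuation."""
    continuations = list(product("HT", repeat=flips_remaining))
    total = sum(heads_so_far + c.count("H") for c in continuations)
    return Fraction(total, len(continuations))

print(expected_total_heads(6, 4))   # 8: six heads already seen plus two expected
print(expected_total_heads(0, 10))  # 5: the figure that only applies before any flips
```

The "balanced" expectation of five heads is correct only before the sequence starts. Once six heads are on the table they are fixed, and only the remaining four flips are still random.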
The phenomenon was first systematized as part of the early-1960s probability-perception literature, but it found its canonical theoretical home in the Tversky-Kahneman research program on representativeness and the law of small numbers.
Tversky & Kahneman 1971 --- “Belief In The Law Of Small Numbers”
The foundational theoretical framing is Tversky, A., & Kahneman, D. (1971). “Belief in the Law of Small Numbers.” Psychological Bulletin, 76(2), 105–110. DOI: 10.1037/h0031322.
The paper opens with a deliberately mischievous demonstration. Tversky and Kahneman surveyed a population of professional research psychologists --- experienced statisticians, in other words, who teach probability to their graduate students --- and asked them to make a series of judgments about small-sample research scenarios. Would a result that just barely reached significance in a sample of 20 likely replicate in a sample of 10? How confident should one be in a striking effect observed in a single small experiment? What sample size is needed to reliably detect an effect of a given magnitude?
The psychologists’ answers were systematically wrong in a single direction. They consistently overestimated the reliability of small-sample results, expected significant effects to replicate at rates that no statistical theory would predict, and behaved, in their actual research decisions, as if a small sample were a “scaled-down” version of a large one --- noisy but qualitatively similar in its tendencies. This was, of course, exactly the population that should have known better. The contrast between what they would have told their students and what they actually believed when they were doing the survey was sharp enough to be funny.
Tversky and Kahneman’s theoretical contribution was to frame this not as a one-off result about psychologists but as a manifestation of a deeper bias in how people reason about random processes. The “law of small numbers” --- the implicit belief that small samples should look like the populations they came from --- was, they argued, the same cognitive ingredient driving the gambler’s fallacy, the hot hand fallacy, and a wider class of representativeness errors. The probability literacy of the reasoner did not matter much. Even trained statisticians fell into the trap when the question was posed in a way that engaged their intuitions rather than their formal training.
For the gambler’s fallacy specifically, the implication is direct. If you implicitly believe that a small run of coin flips should look “like a fair coin” --- meaning roughly balanced between heads and tails --- then a run that has been imbalanced so far (say, five heads in a row) will trigger an expectation that the next flips should compensate. The gambler’s fallacy is, in this framing, not a separate bias but a consequence of the law-of-small-numbers expectation applied to an in-progress sequence.
The 1971 paper is short --- six pages --- and the math is informal. The follow-up that built out the broader theoretical apparatus is Tversky, A., & Kahneman, D. (1974). “Judgment Under Uncertainty: Heuristics and Biases.” Science, 185(4157), 1124–1131. DOI: 10.1126/science.185.4157.1124. That paper introduced the language of “representativeness” as the heuristic doing the work --- people judge the probability of an event by how representative it seems of the underlying process, and a small streak feels unrepresentative of a random process, so they expect compensation. The 1974 paper became one of the most-cited works in the entire behavioral-economics canon, and the gambler’s fallacy is one of the load-bearing examples it uses.
Lab Evidence --- Tune 1964, Burns & Corpus 2004, Ayton & Fischer 2004
The pre-Tversky-Kahneman lab literature was already substantial. The early canonical reference is Tune, G. S. (1964). “Response Preferences: A Review of Some Relevant Literature.” Psychological Bulletin, 61(4), 286–302. DOI: 10.1037/h0048618. Tune reviewed dozens of laboratory studies in which participants were asked to predict or produce random sequences --- guessing which light would flash next, predicting the next digit in a “random” series, or generating sequences they believed were random. The pattern was overwhelmingly consistent. Participants systematically over-alternated. Asked to produce a “random” string of binary outcomes, they generated strings with far more switches between the two options than chance would produce. Asked to predict the next outcome after a short run, they systematically predicted reversal. The behavior was robust across stimuli, populations, and task formats.
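Over-alternation is easy to quantify. One common summary, sketched here with a hypothetical helper of my own rather than anything taken from Tune's review, is the share of adjacent pairs in which the outcome switches. For independent flips of a fair coin each adjacent pair switches with probability 0.5, so a rate well above that is the signature described above.

```python
def alternation_rate(sequence):
    """Fraction of adjacent pairs whose two outcomes differ."""
    if len(sequence) < 2:
        raise ValueError("need at least two outcomes")
    switches = sum(a != b for a, b in zip(sequence, sequence[1:]))
    return switches / (len(sequence) - 1)

print(alternation_rate("HTHTHTHT"))  # 1.0: every pair switches
print(alternation_rate("HHHHTTTT"))  # 1 switch in 7 pairs
print(alternation_rate("HTHHTHTT"))  # 5 switches in 7 pairs
```

The third string is the kind people tend to call "random-looking", and its rate of 5/7 sits well above 0.5. A string this short proves nothing on its own; the lab findings rest on the same excess showing up across many participants and many strings.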
The Tune review is important because it predates the heuristics-and-biases program by a decade. The behavioral phenomenon existed in the literature long before there was a theoretical apparatus to explain it. It was a sturdy, repeated empirical observation in search of a mechanism.
Modern lab work has refined the picture considerably. Burns, B. D., & Corpus, B. (2004). “Randomness and Inductions from Streaks: ‘Gambler’s Fallacy’ versus ‘Hot Hand.’” Psychonomic Bulletin & Review, 11(1), 179–184. DOI: 10.3758/BF03206480. Burns and Corpus asked the obvious question that anyone reading this hub will already be asking: if people expect streaks to reverse (gambler’s fallacy), why do they also expect streaks to continue (hot hand)? Their answer was that the controlling variable is what the participant believes about the underlying process. When participants believe the process is genuinely random --- a roulette wheel, a coin --- they apply the gambler’s-fallacy expectation and bet against the streak. When participants believe the process has a controllable or skill-based component --- a basketball player, a stock-picker --- they apply the hot-hand expectation and bet with the streak. The two intuitions are coherent within a participant; they just point in opposite directions for different perceived data-generating processes.
The complementary paper is Ayton, P., & Fischer, I. (2004). “The Hot Hand Fallacy and the Gambler’s Fallacy: Two Faces of Subjective Randomness?” Memory & Cognition, 32(8), 1369–1378. DOI: 10.3758/BF03206327. Ayton and Fischer reach a similar conclusion through a different route: they argue that the gambler’s fallacy is the natural product of repeated experience with sequences of independent events (where alternation is the local pattern), while the hot-hand expectation is the natural product of repeated experience with sequences of skill-based outcomes (where streaks really do reflect underlying state). Both are functional adaptations to the structure of the environment; neither is an arbitrary glitch. Both papers, taken together, place the gambler’s fallacy on much firmer ground than a simple “people are bad at randomness” framing would suggest --- it is a specific, predictable response to a specific class of perceived processes.
Casino Data (Croson & Sundali 2005) --- Real-World Roulette Betting
The lab work would not be sufficient on its own to call the gambler’s fallacy a robust finding. Lab tasks involve hypothetical stakes, undergraduate participants, and stimuli that may not engage the same cognitive machinery as real-world betting. The most important empirical test was the move from lab to field.
The canonical field demonstration is Croson, R., & Sundali, J. (2005). “The Gambler’s Fallacy and the Hot Hand: Empirical Data from Casinos.” Journal of Risk and Uncertainty, 30(3), 195–209. DOI: 10.1007/s11166-005-1153-2.
Croson and Sundali obtained access to roulette-table surveillance footage from a large Reno casino. They analyzed 18 hours of tape covering a single roulette table over three six-hour blocks across three days in July 1998. They coded every bet placed on the red/black even-money proposition (the simplest binary outcome on the table), along with the recent history of outcomes on the wheel. The dataset is small by modern standards --- roughly 139 bettors and several thousand individual bets --- but the data are real-world, real-money, real-stakes wagers placed by self-selected gamblers in a live casino.
The pattern they found was a textbook gambler’s fallacy. Bets on red after a run of reds declined as the run got longer, while bets on black after a run of reds increased. The shift was monotonic and substantial. After six reds in a row, the proportion of subsequent red bets had fallen sharply relative to the baseline. The pattern was reversed, by symmetry, after runs of black. Players were systematically betting against the recent streak, exactly as the gambler’s-fallacy hypothesis predicts.
The most-cited methodological nuance of the Croson-Sundali paper is that they also found evidence of the hot-hand fallacy --- but on a different variable. Players who had won several bets in a row increased their wager sizes on subsequent bets, as if they believed they had become more skilled or lucky. The same population, in other words, simultaneously displayed the gambler’s fallacy when reasoning about the wheel (it must come up the other color soon) and the hot-hand fallacy when reasoning about themselves (I am on a hot streak). This finding lines up exactly with the Burns-Corpus and Ayton-Fischer theoretical framing: the wheel is perceived as a random process and triggers gambler’s-fallacy expectations, while the gambler’s own performance is perceived as skill-influenced and triggers hot-hand expectations.
For the purposes of this article, the relevant point is that the gambler’s-fallacy half of the Croson-Sundali finding is exactly what the lab literature predicted, at the scale of real money in a real casino. The behavior survives the transition from undergraduate convenience samples to self-selected, motivated, paying gamblers.
Judicial And Professional Decision-Making (Chen, Moskowitz & Shue 2016 QJE)
The Croson-Sundali paper made the gambler’s fallacy a real-world phenomenon. The paper that made it a consequential one for any institution that runs sequential decisions is Chen, D. L., Moskowitz, T. J., & Shue, K. (2016). “Decision Making Under the Gambler’s Fallacy: Evidence from Asylum Judges, Loan Officers, and Baseball Umpires.” Quarterly Journal of Economics, 131(3), 1181–1242. DOI: 10.1093/qje/qjw017.
The Chen-Moskowitz-Shue paper is, I think, the single most important piece of empirical work in the gambler’s fallacy literature, and the reason this article exists. The authors analyzed administrative data from three completely different professional decision-making contexts:
The asylum-judge dataset contained roughly 150,000 decisions by U.S. immigration judges over more than two decades. Each judge sees a sequence of cases, and each case has a binary outcome (grant or deny). Crucially, the order in which cases arrive at any given judge is effectively random with respect to the merits of the cases --- the docket assignment system is not designed to cluster strong or weak cases together. So under the null hypothesis that judges are making decisions on the merits, the probability of a grant on case N should be statistically independent of the outcome on case N-1.
The loan-officer dataset contained roughly 9,000 loan-application decisions by a single large lender. Same structure: each officer sees applications in sequence, the binary outcome is approve or deny, and the merits of consecutive applications are independent.
The MLB umpire dataset contained roughly 1.5 million pitch calls (strike or ball) by home-plate umpires across many seasons. The “merits” of a pitch --- whether it crossed the strike zone --- can be measured independently using the league’s pitch-tracking system, so the authors can test whether umpire calls are biased relative to the ground truth in a sequence-dependent way.
In all three datasets, Chen, Moskowitz, and Shue found the same pattern. The probability of a positive decision on case N (a grant, an approval, a called strike) was significantly lower if the previous case had received a positive decision, and significantly higher if the previous case had received a negative one. The autocorrelation in decisions was negative --- the opposite of what would arise from any “easy day / hard day” or judge-mood effect, and exactly what the gambler’s fallacy predicts. After approving the previous case, the decision-maker became unconsciously more likely to deny the next one, as if mentally “balancing the sequence.” The effect strengthened after longer streaks. After two grants in a row, the next denial was even more likely; after three, more likely still.
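The core quantity in that paragraph is simple to state: how much less likely is a positive decision right after a positive one than right after a negative one? The sketch below is a stripped-down illustration of my own, not the authors' specification (which, as described in this article, also accounts for case characteristics and, for umpires, the measured location of the pitch). The function name and the toy sequence are assumptions.

```python
def lag1_gap(decisions):
    """P(approve | previous approve) minus P(approve | previous deny).

    `decisions` is a list of 0/1 outcomes in the order they were made.
    A clearly negative value is the gambler's-fallacy signature.
    """
    pairs = list(zip(decisions, decisions[1:]))
    after_yes = [cur for prev, cur in pairs if prev == 1]
    after_no = [cur for prev, cur in pairs if prev == 0]
    if not after_yes or not after_no:
        raise ValueError("need at least one decision after each outcome")
    return sum(after_yes) / len(after_yes) - sum(after_no) / len(after_no)

toy_docket = [1, 0, 1, 0, 1, 1, 0, 1, 0, 0]
print(round(lag1_gap(toy_docket), 2))  # -0.55: approvals are rarer right after an approval
```

A ten-decision toy sequence is only there to show the mechanics. In short sequences this gap is pulled somewhat below zero even when decisions are truly independent (the same streak selection effect that tripped up the hot hand literature), so the statistic is only informative on long records and with the merits of each case controlled for.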
The authors estimated the magnitude carefully. For the asylum judges, the gambler’s fallacy was changing roughly 1 to 3 percentage points of decisions --- which sounds small until you remember the dataset contained 150,000 decisions and the underlying outcomes are about asylum, deportation, and human safety. The effect was stronger among more moderate and less experienced decision makers, weaker among the most experienced and most extreme decision makers, and stronger when consecutive cases shared characteristics (so the comparison felt more natural) and when the time between decisions was shorter. The loan-officer effect was of similar magnitude. The umpire effect was small per-pitch but consistent across millions of observations, and it produced systematically biased strike-zone calls that fans, players, and managers had been complaining about for years without knowing exactly why.
The Chen-Moskowitz-Shue paper is the load-bearing citation for the practical relevance of the gambler’s fallacy. It demonstrates the bias in three high-stakes, real-world settings, using administrative data that range from thousands of loan decisions to more than a million pitch calls, in domains where the decision-makers have professional training and explicit incentives to decide on the merits. The bias persists anyway. The methodology has not been seriously challenged in the decade since publication, and the paper has accumulated thousands of citations across economics, psychology, and law.
For any executive reading this hub, the operational implication is the one that should stick: every sequential-decision process in your organization is subject to this bias. Hiring panels evaluating candidates in a sequence. Deal-approval committees reviewing pipelines in batches. Promotion committees voting on candidates in order. Loan officers and underwriters making decisions on a queue. Quality-control inspectors evaluating products on a line. All of them, under the Chen-Moskowitz-Shue evidence, are exhibiting negative autocorrelation in their decisions that has nothing to do with the merits of the cases. The bias is real, measurable, and large enough to matter.
Lottery Number Selection (Clotfelter & Cook 1993)
A complementary piece of real-world evidence comes from the lottery literature. Clotfelter, C. T., & Cook, P. J. (1993). “The ‘Gambler’s Fallacy’ in Lottery Play.” Management Science, 39(12), 1521–1525. DOI: 10.1287/mnsc.39.12.1521. (The paper circulated as NBER Working Paper No. 3769 in 1991 before being formally published in 1993.)
Clotfelter and Cook analyzed three-digit pick-three lottery purchase records from the Maryland State Lottery. In pick-three games, players choose a three-digit number from 000 to 999, and a winning number is drawn at random; the expected payout is the same regardless of which number a player picks. From a pure-expected-value standpoint, the player’s number choice is irrelevant. From a behavioral standpoint, the choice is informative about the player’s mental model of randomness.
The pattern Clotfelter and Cook found was unmistakable. The dollar volume bet on any specific three-digit number fell sharply in the days immediately after that number was drawn as the winner, and only slowly recovered over the following months. Bettors were systematically avoiding numbers that had recently won, on the implicit theory that those numbers were “due to stay away” --- a textbook gambler’s-fallacy expectation applied to a process (the lottery draw) that is genuinely independent across days.
The economic implication depends on how prizes are paid. Where every number carries the same expected payout, as in the pick-three game described above, avoiding recent winners costs the bettor nothing in expectation; it is simply a visible trace of the bettor’s beliefs. Where prizes are instead split among all winners holding the drawn number, the same avoidance is self-defeating: if everyone else shuns recently drawn numbers, betting on them increases your share of any prize, because fewer co-winners will split it with you. In that kind of game, the shunned numbers carry excess expected returns, and bettors who follow their gambler’s-fallacy intuition are leaving money on the table.
The Clotfelter-Cook paper is small (it is a “Notes” section in Management Science) but it has stuck in the literature because the dataset is clean, the prediction is exact, and the behavioral pattern matches lab predictions to within a few percentage points. It is one more data point in a tightly-packed pattern.
How Gambler’s Fallacy Differs From Hot Hand
The hot hand fallacy and the gambler’s fallacy are routinely presented as a matched pair --- two faces of the same misunderstanding of randomness --- and that framing is theoretically convenient. It is also, as the Miller-Sanjurjo reversal of the hot hand established, importantly misleading about the empirical status of the two phenomena.
The hot hand “fallacy” was a claim about whether observed sequences in basketball shooting data contained more clustering than chance would produce. The claim was that they did not. The 2018 reversal showed that the standard test for that claim had a methodological bug --- the streak selection bias --- that produced apparent independence in data that actually contained meaningful clustering. The reversal applies to the academic-statistical test. The everyday intuition --- that basketball players sometimes get hot --- turned out to be more correct than the academic literature had been claiming for thirty years.
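The streak selection bias is counterintuitive enough that it helps to see it by exhaustive enumeration. The sketch below is a toy illustration of my own, not Miller and Sanjurjo's analysis: it takes every equally likely sequence of fair coin flips, computes within each sequence the share of heads among flips that immediately follow a head, and averages those shares. The function name is hypothetical.

```python
from fractions import Fraction
from itertools import product

def mean_share_heads_after_heads(n_flips):
    """Average, over equally likely sequences, of the within-sequence
    share of heads among flips that immediately follow a head.

    Sequences with no head among the first n_flips - 1 positions are
    dropped, because the share is undefined for them.
    """
    shares = []
    for seq in product("HT", repeat=n_flips):
        followers = [seq[i + 1] for i in range(n_flips - 1) if seq[i] == "H"]
        if followers:
            shares.append(Fraction(followers.count("H"), len(followers)))
    return sum(shares, Fraction(0)) / len(shares)

print(mean_share_heads_after_heads(3))  # 5/12
print(mean_share_heads_after_heads(4))  # 17/42
```

Every flip is fair, yet the average share is below 1/2. A test that compares a shooter's post-streak hit rate against their overall rate, sequence by sequence, is therefore biased toward finding "no hot hand" in finite data, which is the bug the reversal exposed.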
The gambler’s fallacy is a claim about whether people expect future outcomes to compensate for past ones in independent random processes. That claim has been tested in dozens of distinct ways:
- Sequence generation tasks (do people produce too-alternating “random” strings?) --- yes, consistently.
- Sequence prediction tasks (after a run, do people predict reversal?) --- yes, consistently.
- Casino betting on independent wheels (Croson-Sundali) --- yes, on real money.
- Lottery number selection (Clotfelter-Cook) --- yes, with economic cost.
- High-stakes professional decisions in legal, financial, and athletic contexts (Chen-Moskowitz-Shue) --- yes, with policy implications.
There is no methodology bug analogous to streak selection bias that could turn any of these findings around. The behavioral pattern is the data; you do not need a complex statistical test to see that lottery players avoid recently-drawn numbers or that asylum judges show negative autocorrelation in their grants. The mechanism (representativeness, law of small numbers) is well-specified and consistent across studies. The effect appears across populations (undergraduates, gamblers, judges, lottery players, umpires), stakes (lab points, casino chips, asylum claims, called strikes), and decision domains (binary predictions, betting, professional judgment).
The honest summary is that the gambler’s fallacy is in the same epistemic category as the default effect from the other anti-example article in this hub: it is a behavioral finding that has been measured at scale, across paradigms, with converging mechanisms, and has not been seriously challenged. The hot hand “fallacy” was in a different category --- a single statistical claim about a specific dataset, undone by a single bias correction. The two phenomena are not symmetric, and the demolition of one does not impeach the other.
Mitigation Strategies
Because the gambler’s fallacy is robust and the mechanism is understood, the menu of mitigation strategies is also fairly well established. None of them rely on telling decision-makers to “be aware of the bias” --- that intervention has been tested directly and does not work, because the bias operates below the threshold of explicit reasoning even for trained statisticians (as Tversky and Kahneman demonstrated in 1971).
Structural randomization. The cleanest fix is to remove the sequential structure that triggers the bias in the first place. When decisions need to be made in batches, randomize the order of cases or items before presentation, so that the decision-maker’s mental “running tally” of recent outcomes is not tracking any meaningful sequence. For hiring panels, this means resisting the natural tendency to evaluate candidates in interview order, and instead deliberating on the full set after all interviews are complete.
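In practice, randomizing presentation order is a few lines of code. The sketch below is one way to do it, under the assumption that each reviewer can be handed their own ordering; the candidate and reviewer labels are hypothetical.

```python
import random

def shuffled_review_order(case_ids, seed=None):
    """Return a new list with the cases in random order; the input is untouched."""
    rng = random.Random(seed)
    order = list(case_ids)
    rng.shuffle(order)
    return order

candidates = ["cand-01", "cand-02", "cand-03", "cand-04", "cand-05"]
for reviewer in ["reviewer-a", "reviewer-b"]:
    # Each reviewer gets an independent ordering of the same candidates.
    print(reviewer, shuffled_review_order(candidates))
```

Giving each panelist a different order does not remove any one panelist's sequential dependence, but it does mean no candidate systematically lands right after a streak for the whole panel. Passing a recorded `seed` makes an ordering reproducible for later audit.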
Forced random-number generation. In contexts where the bias matters most (lottery number selection, Monte Carlo simulation, random assignment to treatment and control), use a vetted random number generator, ideally a cryptographically secure one, rather than human selection. The “random” sequences humans produce are systematically over-alternating, and any analysis that depends on actual independence will be corrupted by them.
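As a minimal sketch of what handing the choice to a machine looks like, the example below uses Python's standard `secrets` module, which draws from the operating system's secure randomness source. The function names and unit labels are my own illustrative assumptions.

```python
import secrets

def draw_pick_three():
    """Three-digit string from 000 to 999, every value equally likely."""
    return f"{secrets.randbelow(1000):03d}"

def assign_arms(unit_ids, arms=("treatment", "control")):
    """Independently assign each unit to an arm with equal probability."""
    return {unit: secrets.choice(arms) for unit in unit_ids}

print(draw_pick_three())
print(assign_arms(["unit-1", "unit-2", "unit-3"]))
```

Independent assignment will sometimes put every unit in the same arm or repeat a recently drawn number. That is what independence looks like, and it is exactly the outcome a human assigner would "correct" away.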
Decision-aid software. Several research groups have built decision-support tools that explicitly flag negative autocorrelation in a decision-maker’s recent decisions and prompt for re-evaluation when it appears. The evidence is preliminary but encouraging: the auditing intervention does reduce the magnitude of the bias, even though awareness alone does not.
Statistical literacy training. Long-form training in probability --- not “here is the bias, watch out for it” but actually working through worked examples of independent versus dependent processes, computing conditional probabilities, and reasoning about small-sample variability --- reduces but does not eliminate the gambler’s fallacy in the trained population. The effect is real but smaller than the structural interventions.
Sequential-decision audits. For high-stakes processes (hiring, deal approval, asylum decisions, loan underwriting), retrospective analysis of decision sequences can detect gambler’s-fallacy patterns and feed back into process redesign. The Chen-Moskowitz-Shue methodology --- regressing decision N on decision N-1, N-2, etc. --- is reproducible and has been adapted for organizational audit use.
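The audit regression can be sketched without any statistics library. What follows is a bare linear probability model of my own construction (decision t regressed on an intercept and the previous decisions), fitted by ordinary least squares on synthetic data. It omits the case-level controls a real audit would need, and the simulated decision-maker, including the 0.2 penalty after an approval, is an assumption chosen purely for illustration.

```python
import random

def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design: lags are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def lag_regression(decisions, n_lags=2):
    """OLS of decision t on an intercept and the previous n_lags decisions.

    Returns [intercept, coef_lag1, ..., coef_lagK].
    """
    rows = [[1.0] + [float(decisions[t - k]) for k in range(1, n_lags + 1)]
            for t in range(n_lags, len(decisions))]
    ys = [float(decisions[t]) for t in range(n_lags, len(decisions))]
    p = n_lags + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(p)]
    return solve_linear(xtx, xty)

# Synthetic decision-maker (assumption): approves with probability 0.5,
# lowered to 0.3 immediately after an approval.
rng = random.Random(0)
decisions = [1]
for _ in range(20000):
    p_yes = 0.5 - 0.2 * decisions[-1]
    decisions.append(1 if rng.random() < p_yes else 0)

intercept, lag1, lag2 = lag_regression(decisions)
print(round(intercept, 2), round(lag1, 2), round(lag2, 2))
```

On this synthetic record the lag-1 coefficient should come out close to the built-in -0.2 and the lag-2 coefficient close to zero, since the simulated bias only looks one decision back. On real data, a negative lag coefficient is a prompt to investigate, not proof on its own; case order has to be unrelated to case merit for the reading to hold.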
Forced-fraction prompts. A specific debiasing intervention that has shown reasonable effect sizes is to explicitly remind decision-makers of the expected base rate before each decision. “Roughly 30% of cases in this caseload are typically approved” prompts, delivered case-by-case, reduce the dependence of decision N on decision N-1 because the decision-maker is anchored to the long-run rate rather than the local running average.
The combination of structural randomization and base-rate reminders is the standard recommendation in the modern decision-architecture literature. Both interventions are cheap. Neither requires the decision-maker to become statistically sophisticated. Both have been shown to reduce the magnitude of the bias by meaningful fractions in field experiments.
What This Anti-Example Tells Us About Robust Cognitive Biases
The gambler’s fallacy survives the kind of scrutiny that has demolished most of the canonical findings in this hub for five reasons that are worth naming explicitly.
First, the effect is measured across multiple paradigms. It does not depend on any single experimental setup. Lab prediction tasks, casino surveillance, lottery purchase records, and administrative decision data all show the same pattern. A finding that appears across four very different measurement methods is much harder to dismiss as a methodological artifact of any one of them.
Second, the lab-and-field convergence is unusually clean. The Tune 1964 lab work predicted exactly the pattern Croson and Sundali found in the casino, which predicted exactly the pattern Chen-Moskowitz-Shue found in administrative decision data. When the lab and the field tell the same story about a behavioral phenomenon, the phenomenon is real.
Third, the mechanism is well-specified. The Tversky-Kahneman representativeness-heuristic framing makes a specific prediction about when the bias should appear (perceived independent processes), when it should not (perceived skill-based processes, where the hot-hand expectation takes over), and how it should scale with sequence length and stakes. All of those predictions have been tested and largely confirmed.
Fourth, the real-world consequences are large. Chen-Moskowitz-Shue documented gambler’s-fallacy distortions in decisions about asylum, lending, and umpiring that affect millions of people. This is not a curiosity about how undergraduates respond to coin-flip prompts. It is a measurable distortion in the operation of major social institutions, and it shows up wherever sequential decisions are made.
Fifth, the debunking attempts have not landed. After Miller-Sanjurjo 2018, sophisticated readers (correctly) asked whether the gambler’s fallacy was on similar ground. The honest answer is that the methodological situation is fundamentally different --- the hot-hand fallacy literature had a specific statistical bug, and the gambler’s fallacy literature does not. There has been no analogous “we ran the test wrong for thirty years” finding for the gambler’s fallacy, and after a decade of post-2015 scrutiny, none appears imminent.
The contrast with most of the rest of this hub is useful. Power posing, ego depletion, money priming, marshmallow tests, the bystander effect: these all fail on at least one of the five criteria above. Between them, they depend on single paradigms, fail to transfer from lab to field, have under-specified mechanisms, have small real-world consequences, or have been actively debunked. The gambler’s fallacy is the opposite case across all five dimensions. It is what a robust cognitive bias actually looks like.
What This Means For Strategists Designing Sequential-Decision Processes
For an executive evaluating the operational implication, the message is calibrated and direct.
You do not need to worry about every behavioral economics finding that gets cited in a McKinsey memo or a Thaler-Sunstein-style nudge unit recommendation. Most of those findings are weak or have failed to replicate, and the field is currently in a period of healthy self-correction about which of them to keep. The intelligent stance is skepticism by default, with the burden of proof on advocates to show the finding survives serious scrutiny.
You do need to worry about the gambler’s fallacy in any process your organization runs where decisions are made in sequence on items whose underlying merits are statistically independent. The list of such processes in a typical company is longer than people initially realize:
- Hiring panels evaluating candidates in a single day of interviews.
- Deal-approval committees reviewing pipelines in batches.
- Promotion committees voting on candidates in a fixed order.
- Investment-committee decisions on consecutive deals.
- Loan, underwriting, and credit-decision queues.
- Quality-control and defect-detection inspection lines.
- Code-review queues where reviewers see PRs in sequence.
- Editorial decisions at media organizations where editors review submissions in order.
- Content moderation at platforms where moderators see flagged items in sequence.
In every one of these processes, the Chen-Moskowitz-Shue evidence implies that some fraction of decisions are being made on negative-autocorrelation grounds that have nothing to do with the merits of the case. The fraction is small per decision (probably 1 to 5 percentage points), but it compounds across decisions and disproportionately affects cases that come immediately after streaks. For a hiring panel that has approved four candidates in a row, the fifth candidate is facing a meaningful systematic headwind. For a deal-approval board that has approved three deals in a row, the fourth deal is facing the same headwind. None of this is conscious. None of it can be fixed by telling the panel to be aware of it. All of it can be partially mitigated by structural interventions: randomizing the order of evaluation, separating consideration from decision (deliberating only after all candidates have been evaluated), and providing explicit base-rate anchors.
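One rough way to see whether a process shows this pattern is to compare the approval rate right after an approval with the approval rate right after a denial. The sketch below is a descriptive check only, not the regression-with-controls approach a study like Chen-Moskowitz-Shue would use; the function name, the input format (one reviewer's decisions in order, True for approve), and the sample log are all hypothetical. Raw gaps like this can also reflect case mix and small-sample noise, so treat a negative gap as a prompt for closer analysis rather than proof.

```python
from typing import Sequence


def approval_rate_by_previous(decisions: Sequence[bool]) -> dict:
    """Compare approval rates after an approval vs. after a denial.

    `decisions` is one reviewer's decisions in the order they were made
    (True = approve). Hypothetical input format, for illustration only.
    """
    # Pair each decision with the one immediately before it.
    pairs = list(zip(decisions, decisions[1:]))
    after_yes = [cur for prev, cur in pairs if prev]
    after_no = [cur for prev, cur in pairs if not prev]

    def rate(xs):
        return sum(xs) / len(xs) if xs else float("nan")

    return {
        "after_approval": rate(after_yes),
        "after_denial": rate(after_no),
        "n_after_approval": len(after_yes),
        "n_after_denial": len(after_no),
    }


# Made-up decision log, purely for illustration.
log = [True, False, True, False, True, True,
       False, True, False, False, True, False]
summary = approval_rate_by_previous(log)
gap = summary["after_approval"] - summary["after_denial"]
print(summary)
print("gap (negative suggests reversal after approvals):", gap)
```

A gap near zero is what independent decisions on independent cases would tend to produce over a long log; a persistently negative gap across many reviewers is the signature described above.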
The intervention is cheap. The evidence is strong. The bias is real. Treat sequential-decision design as a calibration problem worth solving, and you will systematically improve the quality of decisions in any process where the volume is high enough for the bias to matter.
Sources
- Tune, G. S. (1964). Response preferences: A review of some relevant literature. Psychological Bulletin, 61(4), 286–302. DOI: 10.1037/h0048618.
- Tversky, A., & Kahneman, D. (1971). Belief in the law of small numbers. Psychological Bulletin, 76(2), 105–110. DOI: 10.1037/h0031322.
- Tversky, A., & Kahneman, D. (1974). Judgment under uncertainty: Heuristics and biases. Science, 185(4157), 1124–1131. DOI: 10.1126/science.185.4157.1124.
- Clotfelter, C. T., & Cook, P. J. (1993). The “gambler’s fallacy” in lottery play. Management Science, 39(12), 1521–1525. DOI: 10.1287/mnsc.39.12.1521. (Earlier circulated as NBER Working Paper 3769, 1991.)
- Burns, B. D., & Corpus, B. (2004). Randomness and inductions from streaks: “Gambler’s fallacy” versus “hot hand.” Psychonomic Bulletin & Review, 11(1), 179–184. DOI: 10.3758/BF03206480.
- Ayton, P., & Fischer, I. (2004). The hot hand fallacy and the gambler’s fallacy: Two faces of subjective randomness? Memory & Cognition, 32(8), 1369–1378. DOI: 10.3758/BF03206327.
- Croson, R., & Sundali, J. (2005). The gambler’s fallacy and the hot hand: Empirical data from casinos. Journal of Risk and Uncertainty, 30(3), 195–209. DOI: 10.1007/s11166-005-1153-2.
- Chen, D. L., Moskowitz, T. J., & Shue, K. (2016). Decision making under the gambler’s fallacy: Evidence from asylum judges, loan officers, and baseball umpires. Quarterly Journal of Economics, 131(3), 1181–1242. DOI: 10.1093/qje/qjw017.
Related
- Replication Crisis Hub --- the index of the full series.
- The Hot Hand Fallacy: The “Cognitive Bias” That Two Statisticians Reversed In 2018 --- the companion piece on the related phenomenon that did not survive scrutiny.
- The Default Effect: The Behavioral-Economics Finding That Actually Holds Up --- the other anti-example article in the hub.
- The Availability Heuristic --- a related representativeness-program finding.
- Confirmation Bias --- another well-replicated cognitive bias.
- The Halo Effect --- a different category of decision distortion that also affects sequential evaluation.
FAQ
Is the gambler’s fallacy related to the hot hand fallacy?
They are theoretically related --- both involve expectations about sequences --- but empirically asymmetric. The gambler’s fallacy is the expectation that streaks reverse in independent random processes; the hot-hand belief is the expectation that streaks continue in skill-based processes. People apply each to different perceived data-generating processes (Burns & Corpus 2004; Ayton & Fischer 2004). The gambler’s fallacy is robustly supported by lab and field evidence. The hot hand “fallacy” --- the academic claim that basketball streaks were a cognitive illusion --- was reversed by Miller & Sanjurjo 2018, who showed that the standard test had a methodological bias. The hot hand intuition turns out to be more correct than the academic literature claimed.
Has the gambler’s fallacy survived the replication crisis?
Yes. It is one of the cleanest examples of a robust cognitive bias --- measured across lab tasks, casino video, lottery records, and administrative decision data. The mechanism (representativeness, law of small numbers) is well-specified and the field record is consistent with the lab record. There is no analog of the Miller-Sanjurjo statistical correction that would impeach the gambler’s fallacy literature.
How do I de-bias my hiring panel?
Three structural interventions: (1) randomize the order in which candidates are evaluated, (2) separate evaluation from decision --- run all interviews before any deliberation, so the panel does not develop a running tally during the process, and (3) provide explicit base-rate reminders (“we typically hire 1 in 4 candidates from this pipeline”) to anchor the panel to long-run rates rather than the local running average. Telling the panel to “be aware of the gambler’s fallacy” does not work; the bias operates below the threshold of explicit reasoning.
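The first intervention, randomizing evaluation order, needs very little tooling. A minimal sketch, assuming the candidate list fits in memory and that the scheduler is free to reorder it; the function name, the placeholder candidate labels, and the seed value are illustrative, not part of any particular scheduling system:

```python
import random


def randomized_order(candidates, seed=None):
    """Return a shuffled copy of the candidate list; the input is left unchanged."""
    # A seed is optional; passing one only makes the draw reproducible for audit.
    rng = random.Random(seed)
    order = list(candidates)
    rng.shuffle(order)
    return order


candidates = ["A", "B", "C", "D", "E"]  # placeholder labels
schedule = randomized_order(candidates, seed=2024)
print(schedule)
```

Randomizing does not remove the bias from any single decision; it only ensures that no candidate is systematically placed after a streak, so the distortion is not correlated with who the candidate is.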
What about lottery players? Should they avoid recently-drawn numbers?
The mathematics says past draws are informationally irrelevant --- every number is equally likely to come up in the next draw. The economics has a wrinkle: in games where the prize is split among winners, betting on unpopular numbers (which, ironically, include the recently-drawn numbers that gambler’s-fallacy bettors avoid) has a marginally higher expected payout, because you would share any prize with fewer co-winners. Clotfelter & Cook 1993 documented that betting on a number drops sharply right after it is drawn and recovers only gradually. So if you play a split-prize game, the rational strategy is the opposite of what the gambler’s-fallacy intuition recommends.
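The prize-splitting arithmetic can be made concrete with a small sketch. It assumes, purely for illustration, that the number of other winners on your number follows a Poisson distribution, in which case the expected share of the prize given that you win is jackpot × (1 − e^(−λ)) / λ, where λ is the expected number of co-winners. The jackpot and the two λ values below are made-up numbers, not estimates from any lottery.

```python
import math


def expected_prize_if_win(jackpot: float, expected_co_winners: float) -> float:
    """Expected share of a split jackpot, given that you hold a winning ticket.

    Assumes the number of OTHER winners K is Poisson with mean
    `expected_co_winners` (lam), for which E[1 / (1 + K)] = (1 - exp(-lam)) / lam.
    """
    lam = expected_co_winners
    if lam == 0:
        return jackpot  # nobody to share with
    return jackpot * (1 - math.exp(-lam)) / lam


# Hypothetical figures for illustration only.
jackpot = 1_000_000
popular = expected_prize_if_win(jackpot, expected_co_winners=2.0)
avoided = expected_prize_if_win(jackpot, expected_co_winners=1.0)
print(f"popular number: {popular:,.0f}")
print(f"avoided number: {avoided:,.0f}")
```

The probability of winning is the same for both numbers; only the expected share changes, which is why the less popular number comes out ahead under these assumptions.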
What about superstitions in business --- “we just signed three deals, the fourth is bound to fall through”?
That is exactly the gambler’s fallacy applied to a business pipeline. There is no mathematical reason for the fourth deal in a string to be less likely to close than the first, assuming the underlying deal merits are independent (which, in most pipelines, they roughly are). The expectation that “we are due for a loss” is a textbook gambler’s-fallacy expectation, and acting on it (e.g., by under-investing in closing effort on the fourth deal) will produce real costs.
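The independence claim can be checked by brute force. The sketch below enumerates every outcome of four independent deals and computes the exact probability that the fourth closes given that the first three did; the close rate of 3 in 10 is a hypothetical figure, and the independence of deals is the stated assumption, not something the code tests.

```python
from fractions import Fraction
from itertools import product


def p_fourth_given_first_three(p_close: Fraction) -> Fraction:
    """Exact P(deal 4 closes | deals 1-3 closed) for four independent deals.

    Requires p_close > 0, otherwise the conditioning event has probability zero.
    """
    p_three = Fraction(0)  # P(first three close)
    p_four = Fraction(0)   # P(all four close)
    for outcome in product([True, False], repeat=4):
        prob = Fraction(1)
        for closed in outcome:
            prob *= p_close if closed else 1 - p_close
        if all(outcome[:3]):
            p_three += prob
            if outcome[3]:
                p_four += prob
    return p_four / p_three


p = Fraction(3, 10)  # hypothetical close rate
result = p_fourth_given_first_three(p)
print(result)  # equals p: the streak carries no information about deal four
```

Whatever close rate is plugged in, the conditional probability comes back equal to the unconditional one, which is the formal content of "we are not due for a loss."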
Does the bias affect quantitative analysts and people with statistical training?
Yes, less than untrained populations but more than zero. Tversky & Kahneman 1971 specifically demonstrated that trained research psychologists --- people who teach probability for a living --- fall into the same trap when the question is posed in a way that engages intuition rather than formal reasoning. The bias is not eliminated by knowing about it. It can be partially mitigated by structural interventions that bypass the intuitive judgment.
What is the most important paper to read if I only read one?
Chen, Moskowitz, & Shue 2016 in the Quarterly Journal of Economics. It is the highest-stakes, largest-N, most-policy-relevant field demonstration of the gambler’s fallacy in the literature, and the implications for any organization that runs sequential decision processes are direct and actionable.
Where else might the gambler’s fallacy show up that I should worry about?
Any high-volume sequential decision process: hiring, deal approval, promotion votes, loan underwriting, credit decisions, code review, content moderation, editorial selection, quality-control inspection, asset-allocation rebalancing decisions, and committee voting on grants or proposals. If decisions arrive one after another and the underlying merits of consecutive items are independent, the bias is operating somewhere in the process. The mitigation menu (randomize order, separate evaluation from decision, anchor to base rates) is the same across all of them.