Most behavioral-science findings in this hub did not survive scrutiny. The default effect did. Across decades of replication, across countries, across decision domains --- automatic enrollment and opt-out architecture remain the single most reliable nudge in the behavioral toolkit. Here is why this one is different.

If you have been reading through this hub, you have watched a long parade of canonical behavioral-science findings get dismantled. Power posing did not survive Carney’s own recantation. Ego depletion collapsed under Hagger 2016. Money priming evaporated in preregistered replications. Bargh’s elderly-walking study, the marshmallow test, the bystander effect’s Kitty Genovese mythology, the entire family of social-priming results --- one after another, the most-cited demonstrations of “this is how human behavior works” have either failed to replicate or shrunk to something much smaller than their original claim.

A rational reader by now might conclude that all of behavioral economics is suspect. That conclusion would be wrong, and this article exists to explain why.

Because in the same period that produced all those replication failures, one finding kept holding up. It held up across countries with completely different cultures. It held up across decision domains as different as retirement saving, organ donation, energy provider choice, school enrollment, and end-of-life directives. It held up in field experiments with sample sizes in the hundreds of thousands, not the convenience samples of fifty undergraduates that broke so much of the literature. It held up when meta-analysts adjusted for publication bias --- the same adjustment that demolished other nudge findings. And the magnitude of the effect was not the small-d-equals-0.2 polite cough that haunts most of social psychology. The magnitude was, in some applications, the difference between a system that worked and a system that did not.

That finding is the default effect --- the empirical observation that whatever option a chooser has been pre-assigned tends to be the option they end up with, at rates far higher than active preference would predict. Status quo bias, in the Samuelson-Zeckhauser framing. Inertia, in the Madrian-Shea framing. Default architecture, in the Thaler-Sunstein nudge-theory framing.

This is the anti-example article in a hub full of takedowns. It exists for three reasons. First, calibration --- readers should leave the hub knowing that “behavioral science is mostly broken” is wrong; the more accurate claim is “behavioral science has produced a small number of robust, large, mechanism-grounded findings, and a much larger number of fragile, small, context-dependent findings, and the field’s main failure was treating those two categories as if they were the same.” Second, decision-usefulness --- for an executive evaluating which behavioral interventions to actually deploy, defaults are the single safest bet in the catalog, and that is worth saying explicitly. And third, intellectual honesty --- if you spend a hub criticizing behavioral economics, you owe readers the parts that worked.

So here is the case for the default effect, as honest as I can make it, including the legitimate critiques.

Samuelson and Zeckhauser 1988 --- The Foundational Paper

The default effect did not begin with Thaler and Sunstein in 2008, and it did not begin with Madrian and Shea in 2001. It began with a paper that almost nobody outside academic decision science has ever read: Samuelson, W., & Zeckhauser, R. (1988). “Status Quo Bias in Decision Making.” Journal of Risk and Uncertainty, 1(1), 7–59. DOI: 10.1007/BF00055564.

Samuelson and Zeckhauser ran a series of experiments in which subjects were asked to make a choice --- between investment portfolios, between insurance policies, between political candidates, between job offers. The same choice was presented to different groups, but with a twist. Some groups were told they were starting from scratch (“you have just inherited a large sum of money and must decide how to invest it”). Other groups were told that one of the options was already their current state (“you have inherited a portfolio invested in moderate-risk stocks and must decide whether to keep it or switch”).

The substantive options were identical across conditions. The only difference was which option had been labeled the status quo. And in essentially every domain Samuelson and Zeckhauser tested, the option that had been labeled the status quo gained substantial market share --- somewhere between roughly 20 and 40 percentage points, depending on the scenario --- over the same option when it had not been labeled the status quo.

They followed up the lab work with an analysis of real-world health-insurance choices among Harvard employees, who had recently been given a new menu of plan options. Long-tenured employees, who had been on the old plan for years, stuck with it at rates dramatically higher than newly hired employees making the same choice for the first time. The new plans were objectively better for many of these long-tenured employees on the specific dimensions they cared about. They stayed anyway. Inertia, not active preference, was driving the choice.

Samuelson and Zeckhauser’s contribution was twofold. First, they documented the phenomenon across many domains and showed it was not a quirk of any one decision class. Second, they catalogued the candidate mechanisms --- loss aversion, regret avoidance, cognitive cost of deliberation, the implicit endorsement that a status quo can carry --- and noted that the same phenomenon could plausibly arise from any of them. This second point matters because it is part of why the default effect survives: there is not one mechanism that has to be correct for the phenomenon to appear; there are several plausible mechanisms, any of which would predict it.

The paper is from 1988. The findings have not been retracted, the methodology has not been challenged in the way that, say, the elderly-priming methodology was challenged, and the basic phenomenon they identified is exactly the one that subsequent field experiments would document at scale.

Madrian and Shea 2001 --- The 401(k) Auto-Enrollment Study

The paper that turned status-quo bias from an academic curiosity into a major policy tool is Madrian, B. C., & Shea, D. F. (2001). “The Power of Suggestion: Inertia in 401(k) Participation and Savings Behavior.” Quarterly Journal of Economics, 116(4), 1149–1187. DOI: 10.1162/003355301753265543.

Madrian and Shea had access to administrative data from a large U.S. company that, in mid-1998, switched its 401(k) enrollment policy. Before the change, new hires had to actively elect to participate in the company’s 401(k) plan --- fill out a form, choose a contribution rate, choose an asset allocation. After the change, new hires were automatically enrolled at a default contribution rate (3% of pay) and a default asset allocation (a money-market fund), with full freedom to opt out at any time, change their contribution rate, or change their allocation.

The substantive choice architecture was identical. Every employee had complete freedom to enroll or not, to contribute whatever percentage they wanted, and to invest in any of the available funds. The only thing that changed was the default state.

The participation rates moved by an amount that was, at the time, almost embarrassing for the assumption that retirement-savings decisions reflect deliberate preference. Under the old opt-in regime, roughly 37% of new hires were enrolled in the 401(k) by their first six months on the job. Under the new auto-enrollment regime, that number jumped to roughly 86%. A change in the default state, with no change in the underlying options, moved participation by approximately fifty percentage points.
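
To make the size of that shift concrete, here is a minimal sketch of the arithmetic using only the two rates quoted above. The function name `default_shift` is mine, not anything from the paper, and Cohen's h is a standard effect-size measure for a difference between two proportions that I am adding for scale; Madrian and Shea did not report it.

```python
import math


def default_shift(opt_in_rate: float, auto_enroll_rate: float) -> dict:
    """Summarize the gap between two participation rates given as fractions."""
    # Absolute gap in percentage points.
    gap_pp = round((auto_enroll_rate - opt_in_rate) * 100, 1)
    # Relative lift: how many times higher participation is under the default.
    relative_lift = auto_enroll_rate / opt_in_rate
    # Cohen's h: difference between arcsine-transformed proportions.
    cohens_h = 2 * math.asin(math.sqrt(auto_enroll_rate)) - 2 * math.asin(
        math.sqrt(opt_in_rate)
    )
    return {"gap_pp": gap_pp, "relative_lift": relative_lift, "cohens_h": cohens_h}


# Rates as quoted in the article: roughly 37% opt-in vs. roughly 86% auto-enrolled.
summary = default_shift(0.37, 0.86)
print(summary)
```

By Cohen's usual rule of thumb an h of 0.8 already counts as large, and these two rates give an h a little above 1. That is the sense in which this effect is visible in administrative records without any statistical heroics.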

And the effect persisted. Two and three years out, the auto-enrolled cohorts still showed much higher participation than the opt-in cohorts. The contribution-rate default and the asset-allocation default also stuck --- most auto-enrolled employees never moved off the 3% default contribution or the money-market default investment, even though both were almost certainly suboptimal for their long-run wealth (the money-market default in particular meant under-exposure to equities for workers in their twenties and thirties).

The Madrian-Shea result has been replicated repeatedly. Choi, Laibson, Madrian, and Metrick produced a series of follow-up studies through the 2000s confirming the same pattern at additional firms and refining the understanding of when defaults stick versus when employees actively override them. The Pension Protection Act of 2006 codified auto-enrollment as a preferred plan design, and subsequent industry data has shown that participation rates in auto-enroll plans run reliably 25 to 50 percentage points above participation rates in opt-in plans. The original 2001 numbers were not a fluke of one firm or one cohort --- they reflected a phenomenon that has shown up at every employer that has run the natural experiment.

There is no replication crisis in the auto-enrollment literature. The effect was large, has been measured at scale repeatedly, and has held up.

Johnson and Goldstein 2003 --- The Organ Donation Comparison

The default effect’s most cited cross-national illustration is Johnson, E. J., & Goldstein, D. (2003). “Do Defaults Save Lives?” Science, 302(5649), 1338–1339. DOI: 10.1126/science.1091721.

Johnson and Goldstein observed that European countries fell into two roughly drawn buckets on organ-donation policy. Some countries --- Germany, Denmark, the Netherlands, the U.K. at the time --- used an “explicit consent” regime: citizens were not organ donors by default; they had to actively register as donors to have their organs eligible for transplantation. Other countries --- Austria, Belgium, France, Hungary, Poland, Portugal, Sweden --- used a “presumed consent” regime: citizens were donors by default; they had to actively register as non-donors to opt out.

The published donor-consent rates in the two groups were extraordinarily different. Explicit-consent countries clustered at low consent rates --- Germany at roughly 12%, Denmark at 4%, the Netherlands at 28%, the U.K. at 17%. Presumed-consent countries clustered near ceiling --- Austria at roughly 99.98%, Belgium at 98%, France at 99.9%, Hungary at 99.997%, Poland at 99.5%, Portugal at 99.6%, Sweden at 86%. The gap is not subtle. It is approximately the entire range of the variable.

Johnson and Goldstein paired the cross-national observation with a lab experiment. They asked U.S. participants to imagine they had just moved to a new state where the organ-donor default was either “yes” or “no,” and asked them whether they would keep the default or change it. The opt-in condition produced about a 42% donor rate; the opt-out condition produced about an 82% donor rate; a forced-choice neutral condition produced about a 79% donor rate. Same population, same question, same outcome variable. The only thing that moved was the default state, and consent rates roughly doubled.

The 2003 paper has been the subject of substantial follow-up scrutiny, and a couple of nuances are worth flagging honestly. First, the consent-rate gap between presumed-consent and explicit-consent countries is not entirely a default effect --- presumed-consent regimes often coexist with stronger family-consultation norms, more aggressive donor-coordination infrastructure, and different cultural attitudes toward death and the body. Some of the cross-national gap reflects those structural and cultural differences rather than the registry default per se. Second, the cross-national gap in actual transplantation rates is much smaller than the gap in registered consent, because in many presumed-consent countries family members can still effectively veto the donation at the hospital. For that reason, the “default effect saves lives” framing has been challenged for overstating the direct policy implication.

But the underlying behavioral observation --- that the default option captures most chooser-share, in this domain, at very large magnitudes --- has not been challenged. The 42% versus 82% gap in the U.S. lab experiment did not have any cultural-norm confound; it was a within-population manipulation, and it produced one of the largest default effects ever documented. The basic phenomenon is real.
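
One way to read the three lab conditions is to treat the forced-choice condition as the no-default baseline and ask how far each default moves people away from it. The sketch below does that with the rates quoted above; the dictionary keys and the helper name `shift_from_neutral` are my own labels, and the paper's sample sizes are not used here, so this is descriptive arithmetic rather than a significance test.

```python
# Donor rates from the hypothetical-choice experiment, as quoted in the article.
rates = {"opt_in": 0.42, "opt_out": 0.82, "neutral": 0.79}


def shift_from_neutral(condition_rates: dict) -> dict:
    """Percentage-point shift of each default condition relative to forced choice."""
    baseline = condition_rates["neutral"]
    return {
        name: round((rate - baseline) * 100, 1)
        for name, rate in condition_rates.items()
        if name != "neutral"
    }


shifts = shift_from_neutral(rates)
print(shifts)
```

Read this way, the opt-out default lands within a few points of what people choose when forced to decide, while the opt-in default sits far below it. On these numbers, most of the gap comes from the opt-in default suppressing consent, not from the opt-out default inflating it.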

Jachimowicz 2019 --- What 20+ Years of Subsequent Research Showed

By the late 2010s the default-effect literature had grown to several hundred studies across pensions, organ donation, energy provider choice, health-insurance enrollment, end-of-life directives, education choice, environmental opt-ins, e-mail privacy settings, and many other domains. The natural question was whether the headline findings from the founding papers were representative or whether they were just the loud outliers.

Jachimowicz, J. M., Duncan, S., Weber, E. U., & Johnson, E. J. (2019). “When and Why Defaults Influence Decisions: A Meta-Analysis of Default Effects.” Behavioural Public Policy, 3(2), 159–186. DOI: 10.1017/bpp.2018.43 is the canonical answer.

They aggregated 58 default studies, with a pooled sample of 73,675 participants, across all the major application areas. The overall pooled effect size was Cohen’s d = 0.68, which in behavioral-science terms is large --- substantially larger than the typical effect size for nudges in other categories (most other nudges cluster around d = 0.20 to d = 0.40), and large enough to be operationally meaningful at field-experiment scale. The 95% confidence interval (0.53 to 0.83) excluded zero by a wide margin.
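
If a Cohen's d of 0.68 does not mean much to you intuitively, two standard conversions help. The sketch below assumes the outcome is roughly normal with equal variance in both groups, which is a textbook simplification and not something the meta-analysis claims. The function name `interpret_d` is my own.

```python
import math
from statistics import NormalDist


def interpret_d(d: float) -> dict:
    """Translate Cohen's d into two plain-language quantities (normal model assumed)."""
    phi = NormalDist().cdf  # standard normal cumulative distribution function
    return {
        # Cohen's U3: share of the default group scoring above the control-group mean.
        "u3": phi(d),
        # Probability that a random person from the default group outscores
        # a random person from the control group.
        "prob_superiority": phi(d / math.sqrt(2)),
    }


defaults = interpret_d(0.68)   # pooled default effect from Jachimowicz 2019
typical = interpret_d(0.20)    # the small-effect range mentioned in the article
print(defaults, typical)
```

Under those assumptions, d = 0.68 puts about three quarters of the default group above the control-group mean, whereas d = 0.20 puts the figure just under 58 percent, not far from a coin flip.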

More usefully for practitioners, the meta-analysis identified the conditions under which defaults work best:

Endorsement effects. Defaults work better when the chooser interprets the default as conveying information about what the choice architect thinks they should pick. This is partly why employer 401(k) defaults work so powerfully --- employees plausibly read “the company set this as the default” as a soft recommendation. It is also why defaults work worse in contexts where the chooser distrusts the choice architect; defaults set by a company the consumer perceives as adversarial work less reliably.

Endowment effects. Defaults work better when the chooser perceives the default as a reflection of their current state --- something they would have to actively give up to deviate from. The status-quo framing of the original Samuelson-Zeckhauser work captures this. This is why “you are already enrolled, opt out by clicking here” beats “we recommend you enroll, click here to do so” even when the active option is identical.

Effort. Defaults work better when the cost of overriding them is non-trivial. A default that requires a phone call to override sticks more than a default that can be flipped with one click. This is mechanically obvious but matters operationally because it means defaults degrade in effectiveness as choice-architecture interfaces get more user-friendly.

Stake size and decision time. Counterintuitively, defaults do not necessarily degrade for high-stakes decisions --- they sometimes intensify, because choosers facing complex high-stakes choices are more likely to fall back on the default to avoid the cognitive cost of full evaluation. But defaults do degrade for choosers who have substantial decision time and substantial pre-existing preference; an investor who has actively researched their asset allocation will override a 401(k) default at much higher rates than an investor who has not.

The Jachimowicz meta-analysis is the single best reference for the magnitude and conditions of the default effect. The pooled d = 0.68 is the number to remember if you remember nothing else from this article.

The Mertens 2022 vs. Maier 2022 Debate

A reasonable reader might at this point ask: yes, but didn’t a recent meta-analysis collapse most of nudge theory? And the answer is yes --- sort of --- but the default effect is exactly what survived.

The relevant papers are Mertens, S., Herberz, M., Hahnel, U. J. J., & Brosch, T. (2022). “The Effectiveness of Nudging: A Meta-Analysis of Choice Architecture Interventions Across Behavioral Domains.” PNAS, 119(1), e2107346118. DOI: 10.1073/pnas.2107346118, and the response Maier, M., Bartoš, F., Stanley, T. D., Shanks, D. R., Harris, A. J. L., & Wagenmakers, E.-J. (2022). “No Evidence for Nudging After Adjusting for Publication Bias.” PNAS, 119(31), e2200300119. DOI: 10.1073/pnas.2200300119.

Mertens et al. pooled 212 studies of choice-architecture interventions across all categories --- defaults, social norms, framing, decoy effects, salience, simplification --- and reported an overall effect size of d = 0.43, which they characterized as “small to medium” and as supporting the general usefulness of nudging.

Maier et al. responded with a publication-bias-corrected re-analysis using selection models and PET-PEESE adjustment. Their result was uncomfortable: once you accounted for the fact that studies finding nudge effects are much more likely to be published than studies finding null nudge effects, the average estimated effect of nudging across the Mertens database fell to essentially zero. The Mertens 0.43 number, on Maier’s analysis, was an artifact of which studies the literature had chosen to publish.
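
For readers who have not seen a publication-bias adjustment before, here is a minimal sketch of the PET idea: regress each study's effect size on its standard error, weight by precision, and read the intercept as the estimated effect for a hypothetical study with zero standard error. The data below are synthetic numbers I made up so that small, noisy studies report bigger effects; they are not from the Mertens database, and this simple frequentist version is not the analysis Maier et al. actually ran.

```python
def weighted_mean(effects, ses):
    """Naive inverse-variance pooled effect, ignoring publication bias."""
    weights = [1 / se**2 for se in ses]
    return sum(w * d for w, d in zip(weights, effects)) / sum(weights)


def pet_estimate(effects, ses):
    """Weighted least squares of effect on standard error; returns (intercept, slope)."""
    weights = [1 / se**2 for se in ses]
    total = sum(weights)
    se_bar = sum(w * se for w, se in zip(weights, ses)) / total
    d_bar = sum(w * d for w, d in zip(weights, effects)) / total
    sxy = sum(w * (se - se_bar) * (d - d_bar) for w, se, d in zip(weights, ses, effects))
    sxx = sum(w * (se - se_bar) ** 2 for w, se in zip(weights, ses))
    slope = sxy / sxx
    # The intercept is the predicted effect at a standard error of zero.
    return d_bar - slope * se_bar, slope


# SYNTHETIC illustration: a true effect of 0.05 plus a small-study inflation term.
ses = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30]
effects = [0.05 + 2.0 * se for se in ses]

naive = weighted_mean(effects, ses)
corrected, slope = pet_estimate(effects, ses)
print(round(naive, 3), round(corrected, 3), round(slope, 3))
```

In this toy data set the naive pooled estimate is several times larger than the bias-adjusted intercept, which is the same qualitative pattern the Maier re-analysis reported for the nudge literature as a whole.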

This is the kind of exchange that has demolished the credibility of a lot of behavioral interventions. But here is what matters for our purposes: when Maier and colleagues broke down the categories, defaults were one of the few intervention types that survived publication-bias correction with a still-meaningful effect size. Other nudge categories --- social-comparison messaging, framing manipulations, salience cues --- collapsed under bias correction. Defaults degraded less. This is consistent with the Jachimowicz finding that defaults have the largest unadjusted effect size of any nudge category, and consistent with the field-experiment record showing that auto-enrollment effects on participation rates run in the range of 25 to 50 percentage points across dozens of replications.

Mertens et al. also issued a published correction in 2022 acknowledging coding errors and the inclusion of data from a retracted paper. This was, deservedly, taken as embarrassing. It does not change the bottom-line picture, though, which is the same picture you would have gotten if you had simply read Jachimowicz 2019: the default effect specifically is robust, even when the broader nudge literature is not.

So the honest summary is: nudge theory as a generic claim (“small behavioral tweaks reliably move behavior in welfare-relevant directions”) is on much shakier ground than the popular literature implies. But defaults are the load-bearing exception. They are the part of the nudge framework that holds up to bias correction, holds up to replication, holds up to cross-cultural test, and holds up at field-experiment scale.

Why The Default Effect Replicates When So Many Other Findings Don’t

Stepping back from the specific studies, it is worth asking the meta-question: what is different about defaults that makes them survive scrutiny when so many other behavioral-economics findings don’t?

I think there are four reasons, and they are useful as a diagnostic checklist for evaluating any other behavioral finding you might be considering deploying.

The mechanism is concrete and over-determined. Power posing was supposed to work via testosterone and cortisol changes; the testosterone-cortisol mechanism turned out not to be real, which removed the only proposed pathway. Money priming was supposed to work via some unspecified spreading-activation mechanism; when the effect itself failed to replicate, the mechanism story had no independent support. Defaults, by contrast, are over-determined by multiple plausible mechanisms --- effort cost, loss aversion, regret avoidance, implicit endorsement, signaling --- any one of which would predict the observed phenomenon. You can attack any single mechanism story without dismantling the prediction. That over-determination is theoretical resilience.

The conditions of application are well-specified. Defaults work better when endorsement, endowment, and effort conditions are all met; they work worse when they are not. The Jachimowicz meta-analysis lays this out explicitly. By contrast, many of the failed findings --- power posing, ego depletion, money priming --- never developed a clear conditions-of-application story. They were claimed to work generically, which means they had no way to explain failed replications other than “the replicators did it wrong.”

The effect is large enough to detect reliably. A 50-percentage-point shift in 401(k) participation is not an effect that requires a finely tuned experimental paradigm to demonstrate. You can see it in administrative records. Most of the failed findings of the replication crisis were small-effect findings --- d = 0.20 or so --- that required large samples to detect reliably and that, with bias-corrected re-analysis, turned out to be even smaller than that. Defaults are not in that fragile-effect-size range; they are in the obvious-from-the-data range.

The application is high-stakes enough that people have actually run rigorous field experiments. The reason we know so much about defaults is that organizations with billions of dollars on the line --- pension administrators, governments, energy regulators --- have run real field experiments at scale, with administrative-data outcome measurement, with proper control groups. This is not the convenience-sample undergraduate-lab work that produced so many of the failed findings. The default effect has been measured in environments where the measurement methodology is sound enough that there is not much room for the effect to be an artifact.

If you are evaluating any other behavioral-science claim for whether it is likely to hold up, run it against this checklist. Is the mechanism over-determined? Are the conditions of application well-specified? Is the effect large enough to detect without statistical heroics? Has it been measured in high-stakes field experiments with administrative-data outcomes, not lab convenience samples? If yes to all four, you have a candidate for a robust finding. If no to all four, you have a candidate for the next replication-crisis casualty.
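
If it helps to keep the checklist next to your experiment backlog, it can be written down as a small data structure. This is only a sketch: the class and field names are mine, and the article defines just the two extreme verdicts, so the "mixed" label for anything in between is my assumption.

```python
from dataclasses import dataclass, fields


@dataclass
class FindingChecklist:
    """The four diagnostic questions, answered yes (True) or no (False)."""

    mechanism_overdetermined: bool
    conditions_well_specified: bool
    effect_large_enough: bool
    field_tested_with_admin_data: bool

    def score(self) -> int:
        # Count how many of the four questions were answered yes.
        return sum(getattr(self, f.name) for f in fields(self))

    def verdict(self) -> str:
        passed = self.score()
        if passed == 4:
            return "candidate robust finding"
        if passed == 0:
            return "candidate replication-crisis casualty"
        return "mixed: look harder before building on it"  # assumed middle category


# Defaults, scored the way this article argues they should be.
default_effect = FindingChecklist(True, True, True, True)
# A hypothetical lab-only finding that fails every question.
lab_only_finding = FindingChecklist(False, False, False, False)

print(default_effect.verdict())
print(lab_only_finding.verdict())
```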

What This Means For Strategists

The practical takeaways for someone making real decisions about behavioral interventions are:

Defaults are the highest-confidence behavioral intervention you can deploy. If you have a choice architecture problem --- onboarding, enrollment, configuration, consent, subscription, plan selection --- the question of what to set as the default is the single most consequential design decision you will make, and the evidence base for the consequences of that decision is much stronger than the evidence base for most other things behavioral-economics consultants will sell you.

Pick the default that maximizes long-run welfare for the median user, not the default that maximizes your short-run capture. This is the libertarian-paternalism criterion from Thaler and Sunstein. It is also the criterion that survives ethics review and avoids the consumer-protection backlash that has hit dark-pattern defaults in the U.S. and especially the EU. If your default subscription renewal would not be the option a well-informed user would actively choose, you are not exploiting a behavioral nudge --- you are setting yourself up for an FTC enforcement action.

Communicate the default explicitly. The Jachimowicz endorsement-effect finding implies that disclosed defaults work better than hidden ones. “You are enrolled at a 6% contribution rate, which is what most employees at your tenure and salary level select; you can change it at any time” outperforms a silent default both ethically and operationally. The endorsement-signal mechanism is doing real work.

Do not use defaults for fundamentally heterogeneous populations facing high-stakes choices that they actually have time to evaluate. Defaults work best for low-attention, low-information decisions --- or for choices where the choice architect’s preferred option is genuinely the best option for most people. For decisions where individual circumstances matter substantially and where users have the time and capacity to deliberate, active-choice architectures (where users must choose, but no option is pre-selected) often produce better welfare outcomes than defaults, with similar participation rates.

Update your conditional probability on other behavioral interventions downward. If defaults are the one robust thing, then claims about social proof, scarcity, anchoring, loss aversion in marketing contexts, and other nudge-adjacent interventions deserve substantially more skepticism than the consulting-deck version implies. Most of those have not had the field-experiment-with-administrative-outcomes treatment that defaults have had, and most of the lab work supporting them has the same fragility profile as the studies in this hub that failed to replicate.

What This Anti-Example Tells Us About Behavioral Science Overall

The replication crisis is real. The catalog of canonical-then-collapsed findings in this hub is long, and there are more to come. A reasonable executive could read all of that and conclude that behavioral economics is mostly an academic vanity project --- interesting theory, weak evidence, low practical reliability.

That conclusion would be wrong, and the default effect is the best counterexample.

What behavioral science can produce, when it does the work properly, is something genuinely useful: well-conducted, theoretically grounded, replicable findings that have measurably improved policy outcomes at population scale. Auto-enrollment in retirement savings has, on conservative estimates, increased aggregate U.S. retirement wealth by hundreds of billions of dollars relative to the opt-in counterfactual. Organ-donor defaults --- whatever the legitimate caveats about how much of the cross-national gap reflects defaults versus other factors --- have saved lives. The Behavioural Insights Team in the U.K. has produced default-based interventions in tax compliance, energy use, and education enrollment that have outperformed control conditions in field experiments at scale.

The replication crisis did not invalidate behavioral economics. What it invalidated was the version of behavioral economics that treated every clever lab demonstration as a robust generalizable finding. The version that survives is more modest, more conditional, more grounded in field experiments with administrative-data outcomes, and more humble about which interventions actually work versus which are theoretically appealing but empirically fragile.

This is, in fact, what a healthy science looks like. A field that has done the painful self-correction of the last fifteen years and emerged with a smaller, sturdier, more reliable empirical core is in better shape than a field that never did the correction and is still operating on inflated claims. The hub you are reading is a guided tour of the corrections. The default effect is what is left standing afterward.

For an executive choosing where to invest scarce attention on behavioral interventions, the prioritization implied by all of this is: spend disproportionate attention on default architecture in your product and policy decisions, because the evidence is unusually strong; spend much less attention on the long tail of small-effect nudges that get pitched in consulting decks, because most of them are either fragile or already collapsed; and approach any new behavioral claim with the four-question checklist above before you build anything on top of it.

That is the calibration this hub is meant to deliver. The default effect is the proof that calibration is possible.

Sources

  • Samuelson, W., & Zeckhauser, R. (1988). Status quo bias in decision making. Journal of Risk and Uncertainty, 1(1), 7–59. DOI: 10.1007/BF00055564
  • Madrian, B. C., & Shea, D. F. (2001). The power of suggestion: Inertia in 401(k) participation and savings behavior. Quarterly Journal of Economics, 116(4), 1149–1187. DOI: 10.1162/003355301753265543
  • Johnson, E. J., & Goldstein, D. (2003). Do defaults save lives? Science, 302(5649), 1338–1339. DOI: 10.1126/science.1091721
  • Thaler, R. H., & Sunstein, C. R. (2008). Nudge: Improving Decisions About Health, Wealth, and Happiness. Yale University Press.
  • Jachimowicz, J. M., Duncan, S., Weber, E. U., & Johnson, E. J. (2019). When and why defaults influence decisions: A meta-analysis of default effects. Behavioural Public Policy, 3(2), 159–186. DOI: 10.1017/bpp.2018.43
  • Mertens, S., Herberz, M., Hahnel, U. J. J., & Brosch, T. (2022). The effectiveness of nudging: A meta-analysis of choice architecture interventions across behavioral domains. PNAS, 119(1), e2107346118. DOI: 10.1073/pnas.2107346118
  • Maier, M., Bartoš, F., Stanley, T. D., Shanks, D. R., Harris, A. J. L., & Wagenmakers, E.-J. (2022). No evidence for nudging after adjusting for publication bias. PNAS, 119(31), e2200300119. DOI: 10.1073/pnas.2200300119
  • Choi, J. J., Laibson, D., Madrian, B. C., & Metrick, A. (2004). For better or for worse: Default effects and 401(k) savings behavior. In D. A. Wise (Ed.), Perspectives on the Economics of Aging (pp. 81–126). University of Chicago Press.


FAQ

If most behavioral-science findings are weak, why does this one work?

Four reasons: the underlying mechanism is over-determined (effort cost, loss aversion, endorsement, regret avoidance --- any one would predict the effect); the conditions of application are well-specified, so failed replications can be diagnosed rather than dismissed; the effect size is large enough to detect without statistical heroics; and the application has been studied with high-stakes field experiments using administrative-data outcomes rather than convenience-sample lab work. Most failed behavioral findings violate at least three of these conditions.

Are defaults manipulative?

They can be. The ethical question is whether the default is set to maximize chooser welfare (libertarian paternalism, in Thaler and Sunstein’s framing) or to maximize the choice architect’s capture at the chooser’s expense (a dark pattern, in the consumer-protection-law framing). A default to enroll in a retirement plan with full opt-out is generally welfare-improving for the median chooser. A default to auto-renew a paid subscription with deliberately friction-loaded cancellation is generally welfare-extractive. The default effect itself is morally neutral; the deployment determines the ethics.

What’s the right default to set?

The decision rule, from the libertarian-paternalism literature, is to set the default that a well-informed user with adequate time to deliberate would actively choose for themselves. For 401(k) enrollment, that is participation at a contribution rate consistent with adequate retirement saving. For organ donation, it is presumed consent. For subscription renewal, it is the option a user would pick if they were re-deciding from scratch with full price visibility. Where that question genuinely has no single answer because the population is heterogeneous, an active-choice architecture (no pre-selected option, but choice required) often outperforms any default.

What about libertarian-paternalism critiques?

The strongest critique is from political theorists and choice-autonomy advocates who argue that defaults, even welfare-enhancing ones, undermine the deliberation that makes autonomous choice meaningful. There is something to this --- a society that auto-enrolls citizens into all welfare-maximizing options is also a society that has done some of their thinking for them. The Thaler-Sunstein response is that the relevant counterfactual is not “active deliberation” but “the default someone else picks for you anyway” (a corporate or governmental default of inaction), so the question is who picks the default, not whether one exists. That response is contested but coherent.

How big is the default effect in practice?

For 401(k) auto-enrollment, the participation gap is typically 25 to 50 percentage points (Madrian-Shea found 86% vs. 37%). For organ-donor consent rates, the within-population (lab) gap is roughly 40 percentage points; the between-country gap is much larger but confounded by cultural and infrastructural factors. Across application domains, Jachimowicz 2019 reports a pooled Cohen’s d of 0.68 for defaults --- large by behavioral-science standards.

Does the default effect work in B2B and enterprise sales?

The evidence base in B2B is thinner, but the conditions that make defaults work (endorsement, endowment, effort cost) generally apply. Enterprise procurement decisions often have a strong status-quo bias --- buyers stick with incumbent vendors at rates that exceed active-comparison preference, which is consistent with status-quo bias as documented in Samuelson-Zeckhauser. The implication for challenger vendors is that the cost of dislodging an incumbent is higher than feature-comparison would predict, because the incumbent has the status-quo default.

Why didn’t the Maier 2022 critique of Mertens destroy default research the way it weakened other nudges?

When Maier and colleagues applied publication-bias correction to the Mertens database, most nudge categories collapsed but defaults retained a meaningful effect size. This is consistent with the Jachimowicz finding that defaults have the largest unadjusted effect size of any nudge category --- a large true effect degrades less under bias correction than a small one. It is also consistent with the field-experiment record, which has documented default effects at scale outside the publication-biased academic literature.

What’s the single best deployment example for a startup or growth team?

Onboarding configuration. Whatever settings, integrations, notifications, or feature toggles your product offers, the defaults you ship will be the settings most users have a year later. Treat that decision with the seriousness the evidence implies. Set the defaults that produce the best long-run user outcome and the highest retention, and disclose them clearly. This single design decision will, in most products, outperform any other behavioral intervention you could layer on top.




Atticus Li

Experimentation and growth leader. CXL-certified CRO practitioner, Mindworx-certified behavioral economist (1 of ~1,000 worldwide). 200+ A/B tests across energy, SaaS, fintech, e-commerce, and marketplace verticals.