Most behavioral findings in this hub collapsed under scrutiny. Tversky and Kahneman 1981’s Asian disease problem did not. Forty-plus years of replication, two meta-analyses, and a Many Labs result of 31 of 36 labs hitting significance --- framing is what robust applied cognitive psychology actually looks like.
If you have been reading through this hub, you have watched canonical demonstrations of “this is how the mind works” get dismantled one after another. Ego depletion failed in Hagger 2016’s preregistered multi-lab replication. Power posing collapsed when its first author publicly retracted her endorsement. Money priming evaporated when it left the original lab. The Asch conformity rate shrank by half. Implicit bias, on the modern read, predicts almost nothing about individual behavior. Bem’s precognition study should have ended the file-drawer era a decade earlier than it did. By the time you reach this article you might reasonably suspect that all of cognitive psychology is suspect, that prospect theory is just the latest cathedral built on sand, and that anything sold as a “behavioral finding” deserves the same withering skepticism the rest of the hub applies elsewhere.
That suspicion is the wrong update.
Because in the middle of all that wreckage, one of cognitive psychology’s most-cited demonstrations of irrational choice --- a result first published in 1981 with 152 undergraduates and a single hypothetical disease vignette --- kept holding up. It held up when researchers in the 1990s tried to discredit it as a linguistic artifact. It held up in Kühberger’s 1998 meta-analysis of 136 papers and roughly 30,000 participants. It held up in a 2018 publication-bias-corrected re-appraisal that pushed the effect size up, not down. It held up in a Many Labs replication effort that hit statistical significance in 31 of 36 independent laboratories. The mechanism is built into the kink at the reference point in prospect theory’s value function, and that value function has been independently estimated from gambling data, household financial decisions, taxi-driver labor supply, professional golfer putt-effort calibration, and a long list of field-experimental settings that have no obvious connection to a hypothetical Asian disease.
That finding is the framing effect --- the observation that mathematically identical choice problems, presented as gains versus losses, systematically pull choosers toward different decisions. This is the second anti-example in the hub, after defaults. It exists for the same three reasons: calibration (so readers leave the hub understanding that “behavioral science is mostly broken” is too crude a summary), decision-usefulness (so executives evaluating which behavioral concepts to actually deploy know which ones are load-bearing), and intellectual honesty (because if you spend a hub on takedowns, you owe readers the parts that worked).
Here is the case for the framing effect, with the legitimate caveats included.
The Asian Disease Problem: 152 Undergraduates, Two Vignettes, One Permanent Result
The foundational paper is Tversky, A., & Kahneman, D. (1981). “The framing of decisions and the psychology of choice.” Science, 211(4481), 453–458. DOI: 10.1126/science.7455683.
The Asian disease problem is the experiment everyone knows. Participants were told the United States was preparing for the outbreak of an unusual Asian disease expected to kill 600 people. Two alternative programs to combat the disease had been proposed. Group one received the programs framed in survival terms:
- Program A: 200 people will be saved.
- Program B: A one-third probability that all 600 people will be saved, and a two-thirds probability that no people will be saved.
Group two received the programs framed in mortality terms:
- Program C: 400 people will die.
- Program D: A one-third probability that nobody will die, and a two-thirds probability that 600 people will die.
The mathematics is identical. Program A is Program C; Program B is Program D. A rational chooser whose preferences satisfy the basic requirements of expected utility theory (transitivity, independence, and, most relevant here, invariance to how the options are described) should respond identically to the two framings, because the framings describe identical outcome distributions over the same population.
The respondents did not treat the two versions alike. In the survival framing (152 respondents), 72% chose the certain option (Program A) over the gamble. In the mortality framing (155 respondents), 78% chose the gamble (Program D) over the certain option. Comparable respondents, the same options, the same outcome distribution, and a 50-percentage-point swing in the share choosing the certain option (72% versus 22%), driven entirely by whether the experimenter described the outcomes in terms of who would be saved or who would die.
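The equivalence and the size of the swing can be checked with a few lines of arithmetic. The sketch below is only an illustration: it recodes all four programs as probability distributions over deaths (the dictionary layout and helper name are assumptions made for this example, not anything from the original paper) and then computes the swing from the percentages reported above.

```python
from fractions import Fraction

TOTAL_AT_RISK = 600

# Each program is a list of (probability, number of deaths) pairs.
# Survival-framed programs are converted to deaths so all four share one scale.
programs = {
    "A": [(Fraction(1), TOTAL_AT_RISK - 200)],  # 200 saved for sure
    "B": [(Fraction(1, 3), TOTAL_AT_RISK - 600), (Fraction(2, 3), TOTAL_AT_RISK - 0)],
    "C": [(Fraction(1), 400)],                  # 400 die for sure
    "D": [(Fraction(1, 3), 0), (Fraction(2, 3), 600)],
}

def expected_deaths(program):
    """Probability-weighted number of deaths for one program."""
    return sum(p * deaths for p, deaths in program)

# A and C describe the same distribution, as do B and D.
same_sure_option = sorted(programs["A"]) == sorted(programs["C"])
same_gamble = sorted(programs["B"]) == sorted(programs["D"])

expected = {name: expected_deaths(prog) for name, prog in programs.items()}

# Share choosing the sure option in each frame, from the reported percentages.
sure_share_survival = 72          # percent choosing Program A
sure_share_mortality = 100 - 78   # percent choosing Program C
swing_points = sure_share_survival - sure_share_mortality

print(same_sure_option, same_gamble)
print(expected)
print(swing_points)
```

All four programs carry the same expected number of deaths, so neither expected value nor the outcome distributions can account for the difference in choices.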
The interpretation, which Tversky and Kahneman laid out in the paper and developed further in Kahneman, D., & Tversky, A. (1984). “Choices, values, and frames.” American Psychologist, 39(4), 341–350. DOI: 10.1037/0003-066X.39.4.341, is that choosers do not evaluate outcomes against an absolute zero of welfare. They evaluate outcomes against a reference point that the wording of the problem supplies. Outcomes above the reference point are gains; outcomes below are losses. The value function for gains is concave (risk-averse), and the value function for losses is convex (risk-seeking). The survival framing sets the reference point at the state in which all 600 die, so every life saved registers as a gain, and the chooser is risk-averse over gains. The mortality framing sets the reference point at the state in which nobody dies, so every death registers as a loss, and the chooser is risk-seeking over losses. Same outcomes, different reference points, different risk preferences, different choices.
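To make the reversal concrete, here is a minimal sketch that scores the four programs with a power-form value function. Several things here are assumptions for illustration: the functional form, the parameter values (0.88, 0.88, 2.25, the figures quoted later in this article), and the decision to ignore probability weighting. It is not a reconstruction of the authors' own analysis.

```python
ALPHA = 0.88   # curvature for gains (assumed value)
BETA = 0.88    # curvature for losses (assumed value)
LAMBDA = 2.25  # loss-aversion multiplier (assumed value)

def value(x, alpha=ALPHA, beta=BETA, lam=LAMBDA):
    """Power-form value of a change x relative to the reference point."""
    if x >= 0:
        return x ** alpha
    return -lam * (-x) ** beta

def prospect_value(prospect):
    """Probability-weighted sum of values; probability weighting is omitted."""
    return sum(p * value(x) for p, x in prospect)

# Survival frame: outcomes are coded as lives saved (gains).
program_a = [(1.0, 200)]
program_b = [(1 / 3, 600), (2 / 3, 0)]

# Mortality frame: outcomes are coded as lives lost (losses).
program_c = [(1.0, -400)]
program_d = [(1 / 3, 0), (2 / 3, -600)]

v_a, v_b = prospect_value(program_a), prospect_value(program_b)
v_c, v_d = prospect_value(program_c), prospect_value(program_d)

print(f"survival frame:  A={v_a:.1f}  B={v_b:.1f}  -> prefers {'A' if v_a > v_b else 'B'}")
print(f"mortality frame: C={v_c:.1f}  D={v_d:.1f}  -> prefers {'C' if v_c > v_d else 'D'}")
```

Under these assumptions the sure option wins in the survival frame and the gamble wins in the mortality frame. In this simplified form the loss-aversion multiplier scales both mortality-frame options equally, so the reversal in the sketch comes from the curvature on either side of the reference point.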
This is not just one vignette. The 1981 paper included additional choice problems demonstrating the same pattern in monetary domains, in gambles over jobs, and in willingness-to-pay decisions. The same framing-induced preference reversal showed up across all of them. The vignette format is what made the Asian disease problem famous, but the underlying claim was always that the phenomenon is general.
The 1981 paper is what eventually made prospect theory (already published in 1979 in Econometrica) inescapable in applied behavioral science. Without the framing demonstrations, prospect theory was a more accurate axiomatic alternative to expected utility theory --- interesting to mathematical psychologists, less obviously consequential outside the academy. With the framing demonstrations, it became visible that a chooser’s reference point was not just an abstract parameter in a value function. It was the dial that practitioners --- doctors describing treatments, financial advisors describing portfolios, insurance underwriters describing policies, governments describing public-health interventions --- were continuously turning without realizing it, often producing entirely different downstream choices than they intended.
Forty Years of Replication: Why The Effect Did Not Die
The natural prediction, in the post-replication-crisis era, would be that an effect this famous, this counterintuitive, and this convenient for the careers of its original authors would turn out to be at least partly overstated. That prediction has been tested repeatedly, and it has been wrong.
The most direct test came from a sequence of attempts in the 1990s and 2000s to argue that the Asian disease problem was a linguistic artifact rather than a genuine framing effect. The most prominent of these was a paper by David Mandel arguing that the original wording was ambiguous --- “200 will be saved” could be interpreted as “at least 200 will be saved” or “exactly 200 will be saved,” and the ambiguity, not the framing, was producing the apparent preference reversal. Mandel reported that when he added the word “exactly” to disambiguate, the framing effect substantially attenuated.
The Data Colada team (Simmons, Simonsohn, and Nelson) re-ran the experiment using Mandel’s procedure with roughly 2.5 times Mandel’s sample size --- 98 participants per cell versus Mandel’s 38. They found a strong framing effect even with the “exactly” wording, statistically significant at p < .001. The artifact explanation did not survive an adequately powered replication. The original effect was real; the apparent attenuation under “exactly” was probably a sample-size issue in Mandel’s data, not a genuine moderator of the phenomenon.
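A rough power calculation shows why 38 participants per cell is a thin basis for concluding that an effect has attenuated. The sketch below is an approximation under stated assumptions: it treats the comparison as a two-sided, two-sample z-test on a standardized effect, and it borrows d = 0.31 (the pooled estimate from the meta-analysis discussed in this article) as a plausible effect size. Framing studies record binary choices, so this is an order-of-magnitude guide rather than the analysis either team ran.

```python
from math import sqrt
from statistics import NormalDist

def two_group_power(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test for effect size d."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = d * sqrt(n_per_group / 2)  # expected test statistic if d is the true effect
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)

ASSUMED_D = 0.31  # assumption: the pooled meta-analytic estimate, used as a stand-in

power_small = two_group_power(ASSUMED_D, 38)
power_large = two_group_power(ASSUMED_D, 98)
print(f"n=38 per cell: power ~ {power_small:.2f}")
print(f"n=98 per cell: power ~ {power_large:.2f}")
```

Under these assumptions the smaller design would detect the effect a little over a quarter of the time and the larger one somewhat more than half the time, which fits the reading that a weak result at 38 per cell says little about whether the effect is really there.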
The most decisive single piece of evidence in the modern replication literature is the Asian disease problem’s performance in a Many Labs replication. The effect was statistically significant at conventional levels in 31 of 36 independent laboratories that ran the protocol. That hit rate --- roughly 86% --- is dramatically higher than the hit rates for most of the social-priming and identity-prime findings that headlined the early-2010s replication-crisis wave, which typically replicated in 30% to 50% of labs. The Asian disease problem is among the most reliable demonstrations in the entire Many Labs catalog.
It is worth pausing on what this means. The replication crisis is real, but it is not uniform. Some findings turned out to be roughly as robust as their authors had claimed. Others turned out to be ghosts. Treating the entire pre-2010 catalog of behavioral findings as equally suspect is not skepticism; it is laziness. The framing effect, on every replication metric anyone has run, is in the robust category. If you are going to retain a Bayesian prior that “any pre-2010 behavioral finding is probably wrong,” the framing effect is among the first results that should update your prior toward “but not this one.”
Kühberger 1998: The Meta-Analysis That Quantified It
The canonical meta-analysis of the framing-effect literature is Kühberger, A. (1998). “The influence of framing on risky decisions: A meta-analysis.” Organizational Behavior and Human Decision Processes, 75(1), 23–55. DOI: 10.1006/obhd.1998.2781.
Kühberger aggregated 136 empirical papers reporting framing experiments, covering roughly 30,000 participants and yielding 230 distinct effect-size estimates. The headline finding was a pooled Cohen’s d of approximately 0.31 --- small-to-moderate by behavioral-science conventions, but reliably distinguishable from zero, and notably not driven by a handful of large outlier studies. The framing effect was a real, generalizable phenomenon, not an artifact of any one experimental setup.
More usefully for practitioners, Kühberger documented systematic moderators that determined when the effect was larger versus smaller. Effects were larger when the choice was over hypothetical (rather than incentivized) consequences, when the problem involved larger numbers of affected parties, when the domain was lives-saved rather than money, and when the participants were students rather than experienced professionals. Effects were smaller in incentivized monetary gambles, in within-subjects designs where participants saw both framings, and in domains where participants had substantial prior experience. The meta-analysis was, in a sense, an inventory of the conditions under which prospect theory’s value-function kink at the reference point will and will not produce visible preference reversals in measured choice data.
The Kühberger meta-analysis is the load-bearing reference for anyone making operational decisions about the framing effect. The pooled d = 0.31 is the number to remember if you remember only one number from this article. But the conditions-of-application catalog is what determines whether your specific application will show a large effect, a small effect, or something close to zero.
Steiger & Kühberger 2018: Publication-Bias Correction That Pushed The Effect Up
The honest critique you would expect to surface, in the post-replication-crisis era, is that the Kühberger 1998 number is inflated by publication bias --- by the fact that studies finding framing effects are more publishable than studies finding null framing effects, and that any naive pool of the published literature will overestimate the true effect.
That critique was tested. The relevant paper is Steiger, A., & Kühberger, A. (2018). “A meta-analytic re-appraisal of the framing effect.” Zeitschrift für Psychologie, 226(1), 45–55. DOI: 10.1027/2151-2604/a000321.
Steiger and Kühberger re-analyzed the Kühberger 1998 database using p-curve analysis, a method developed by Simonsohn, Nelson, and Simmons that uses the distribution of statistically significant p-values across a literature to infer whether the underlying effect is real or is an artifact of selective reporting. A real effect produces a right-skewed curve, with more very small p-values than p-values just under .05; a literature with no true effect produces a flat curve, and one shaped by p-hacking piles up p-values just under .05. P-curve can therefore distinguish “true effect with publication bias” from “no true effect, only publication bias and p-hacking.” If the literature is dominated by p-hacked nulls, p-curve will show it; if there is a real effect being inflated by publication bias, p-curve will show that too, and the corrected estimate would ordinarily be expected to sit between the published value and zero.
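The intuition behind p-curve can be sketched without any data. The snippet below is a simplified illustration, not the full p-curve procedure: it assumes every study is a two-sided, two-sample z-test with 50 participants per group (a hypothetical design), and asks what share of the significant results should land below p = .025 when there is no true effect versus when the true effect is d = 0.31.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()

def prob_p_below(threshold, d, n_per_group):
    """P(two-sided p-value < threshold) for a two-sample z-test with true effect d."""
    z_cut = Z.inv_cdf(1 - threshold / 2)
    shift = d * sqrt(n_per_group / 2)
    return (1 - Z.cdf(z_cut - shift)) + Z.cdf(-z_cut - shift)

def share_below_025(d, n_per_group):
    """Among significant results (p < .05), the share with p < .025."""
    return prob_p_below(0.025, d, n_per_group) / prob_p_below(0.05, d, n_per_group)

null_share = share_below_025(0.0, 50)   # no true effect: significant p-values are uniform
real_share = share_below_025(0.31, 50)  # assumed true effect of d = 0.31
print(f"no true effect: {null_share:.2f} of significant p-values fall below .025")
print(f"d = 0.31:       {real_share:.2f} of significant p-values fall below .025")
```

With no true effect the significant p-values split evenly around .025; with a real effect they bunch toward zero. That asymmetry is the signal p-curve looks for.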
The result for the framing literature was striking, and not in the direction critics expected. The publication-bias-corrected effect size was d = 0.52 --- considerably larger than the original Kühberger 1998 estimate of d = 0.31, not smaller. There was no evidence of intensive p-hacking in the framing literature. The distribution of significant p-values was consistent with a real, sizable underlying effect being underestimated by the original meta-analytic method, not overestimated.
This is rare. Most behavioral-science literatures, when subjected to publication-bias correction, shrink. Many of them collapse to essentially zero (see, for example, the Maier et al. 2022 re-analysis of the broader nudge literature, discussed in the defaults anti-example). The framing literature is unusual in growing under bias correction. The most plausible explanation is that the underlying phenomenon is robust enough that the file-drawer problem in this specific literature is relatively small: researchers tend to find the effect, write up the result, and submit it. The unpublished studies that would normally produce publication bias are scarce here because the effect was rarely null, not because researchers were burying nulls.
The bottom line, after Steiger and Kühberger 2018, is that the framing effect is at minimum a d = 0.31 finding and is plausibly a d = 0.5 or larger finding once you correct for the conservative bias in unadjusted meta-analytic pooling. Either number is in robust-effect territory by behavioral-science standards. The framing literature is not in the fragility zone that consumed so many other findings in this hub.
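One practical way to read the two estimates is to ask how many participants a study would need to detect each of them. The following is a back-of-the-envelope sketch using the standard normal-approximation formula for a two-group comparison; the 80% power target and the 5% two-sided significance level are conventional defaults chosen for this example, not values taken from either meta-analysis.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(d, power=0.80, alpha=0.05):
    """Approximate per-group sample size for a two-sided, two-sample comparison."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)

for d in (0.31, 0.52):
    print(f"d = {d}: about {n_per_group(d)} participants per framing condition")
```

Under this approximation the gap between the two estimates is the difference between needing roughly 160 and roughly 60 participants per condition, so the choice of estimate matters for study planning even though both count as robust.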
Connection To Prospect Theory: The Mechanism Is Built In
A common diagnostic for whether a behavioral finding is likely to hold up over time is whether it has a concrete, falsifiable mechanism, ideally one that is over-determined by multiple independently testable predictions. The default effect, in the previous anti-example, has this property --- multiple plausible mechanisms (effort cost, loss aversion, regret avoidance, endorsement signaling) each independently predict the effect, so attacking any one mechanism does not dismantle the prediction. Findings with no concrete mechanism (power posing’s “embodied cognition” mechanism was always vague) tend not to survive.
The framing effect has this property in an even stronger form: the mechanism is explicit, mathematized, and integrated into the dominant alternative to expected utility theory in modern decision science. The mechanism is the kink at the reference point in prospect theory’s value function.
Prospect theory, introduced in Kahneman, D., & Tversky, A. (1979). “Prospect theory: An analysis of decision under risk.” Econometrica, 47(2), 263–291, proposes that the value people assign to outcomes is a function not of absolute wealth but of changes relative to a reference point, with three key features. First, the value function is concave for gains and convex for losses (risk-averse over gains, risk-seeking over losses). Second, it is steeper for losses than for gains (loss aversion; a loss of $100 hurts more than a gain of $100 helps). Third, it has a kink at the reference point: the function itself is continuous there, but its slope changes abruptly when you cross from gain to loss territory.
The Asian disease problem is the cleanest demonstration of why this matters. Move the reference point and you move the chooser between the concave and convex regions of the value function. Same outcomes, different valuations, different risk preferences. The framing effect is not an isolated quirk; it is the directly observable consequence of the most empirically established feature of the prospect-theory value function.
This integration matters because it makes the framing effect resilient in a way isolated lab findings are not. The prospect-theory value function has been independently estimated from financial markets data, from taxi-driver labor supply, from professional sports performance, from household consumption decisions, from auction behavior, and from policy-relevant field experiments in domains as varied as energy use, retirement saving, and tax compliance. The estimates vary by setting, but many land in the neighborhood of the parameter values Tversky and Kahneman reported in their 1992 follow-up on cumulative prospect theory: concavity coefficient around 0.88, convexity coefficient around 0.88, loss-aversion coefficient around 2.25. The reference-dependence-with-loss-aversion architecture is among the most empirically validated structures in modern decision science.
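The three features of the value function can be checked numerically with the parameter values quoted in the text above. This is a sketch under an assumed power-form specification; the function name and the test amounts (50 and 100) are illustrative choices.

```python
ALPHA, BETA, LAMBDA = 0.88, 0.88, 2.25  # parameter values quoted in the text above

def value(x):
    """Value of a change x relative to the reference point (x = 0)."""
    return x ** ALPHA if x >= 0 else -LAMBDA * (-x) ** BETA

# 1. Concave over gains: a sure 50 is worth more than a 50/50 shot at 100.
sure_gain, risky_gain = value(50), 0.5 * value(100)

# 2. Convex over losses: a 50/50 shot at losing 100 hurts less than a sure loss of 50.
sure_loss, risky_loss = value(-50), 0.5 * value(-100)

# 3. Loss aversion: a loss weighs LAMBDA times as much as an equal-sized gain.
loss_gain_ratio = abs(value(-100)) / value(100)

print(f"gains:  sure={sure_gain:.2f}  risky={risky_gain:.2f}")
print(f"losses: sure={sure_loss:.2f}  risky={risky_loss:.2f}")
print(f"loss/gain ratio: {loss_gain_ratio:.2f}")
```

The sure option scores higher on the gain side and the gamble scores higher on the loss side. That is the risk-averse and risk-seeking pattern the framing effect exploits.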
If the framing effect were a fragile isolated lab finding, it would be vulnerable to the kind of mechanism collapse that took down power posing. But it is not isolated. It is the lab-experimental shadow of a value function that has been re-derived from a dozen independent natural settings. To make the framing effect go away, you would have to make prospect theory go away. Nobody has come close to doing that.
Real-World Applications: Where The Framing Effect Actually Operates
The framing effect is not a lab curiosity. It is a continuously active variable in domains where the wording of choices is consequential, and the practitioners in those domains are well aware of it. Four application areas are particularly developed.
Medical informed consent and shared decision-making. When a surgeon describes a procedure as “90% survival rate” versus “10% mortality rate,” patient acceptance rates differ in the direction prospect theory predicts. When an oncologist describes a chemotherapy regimen as “response rate of 30%” versus “70% of patients do not respond,” patient enrollment in treatment differs. Modern medical-education curricula now explicitly teach the framing effect as a clinician-side bias to manage, and shared-decision-making frameworks recommend presenting outcomes in both gain and loss frames so that the chooser’s preferences are not artifacts of the clinician’s chosen framing. The FDA’s draft guidance on informed-consent documentation acknowledges the framing literature as a relevant input. The replication-crisis-era debate has not weakened this application; it has strengthened it, because the framing effect is one of the cognitive findings that survived scrutiny and is therefore safe to teach to physicians as a clinically relevant phenomenon.
Financial product disclosure. When a credit-card issuer describes a fee structure as “$10 surcharge for paying with credit” versus “$10 discount for paying with cash,” choice behavior shifts in the direction prospect theory predicts. The European Union’s PSD2 regulations and similar U.S. consumer-finance regulations are written with explicit awareness of the framing effect --- the rules are deliberately designed to constrain the framings firms are allowed to use, on the theory that framings genuinely move choice rather than just providing different ways to communicate the same information. If the framing effect were not real, those regulations would be paranoid; that they have been promulgated by serious financial regulators across multiple jurisdictions reflects an institutional consensus that the effect is real and consequential.
Public-policy communication. The Asian disease problem itself was a public-policy framing, and modern public-health communication (around vaccination, around prevention versus treatment, around screening recommendations) explicitly tests messages in both gain and loss frames. The CDC’s risk-communication guidance acknowledges the framing literature. Behavioural Insights Team field experiments on tax compliance, on benefits uptake, on energy conservation, and on charitable giving have routinely manipulated framing as a primary lever and have documented effects in the direction and magnitude prospect theory predicts. The translation from lab to field is, by behavioral-science standards, unusually clean.
Insurance choice. The typology of framing effects in Levin, I. P., Schneider, S. L., & Gaeth, G. J. (1998). “All frames are not created equal: A typology and critical analysis of framing effects.” Organizational Behavior and Human Decision Processes, 76(2), 149–188 distinguishes risky-choice framing (the Asian disease type) from attribute framing (the ground beef labeled “75% lean” versus “25% fat” type) and from goal framing (the “exercise to gain health” versus “exercise to avoid disease” type). All three types are well-documented; the typology is what made it possible to operationalize the framing effect in domains as varied as insurance underwriting (attribute framing of risk pools), product marketing (attribute framing of features), and health-behavior messaging (goal framing of compliance recommendations). The insurance industry’s deductible-versus-coinsurance choice architectures, in particular, have been shown to produce systematically different policy uptake when framed as protection against loss versus opportunity for gain.
The pattern across all four application areas is the same. The framing effect moves real choice in real high-stakes settings, the magnitudes are large enough to matter operationally, the regulators and practitioners who deal with the relevant choice domains are well aware of the effect, and the effect has been documented enough times in enough independent settings that betting against it would be a strange thing for a strategist to do.
What This Means For Strategists And Operators
The practical implications, for someone making real decisions in any of the application areas above --- or in any domain where outcome wording is a design choice rather than fixed --- are roughly:
Assume the framing of any outcome you describe is doing real work on the chooser’s decision. The framing effect is large enough and robust enough that the framing decision is a substantive design choice, not a cosmetic one. If you are presenting outcome data --- to patients, to customers, to investors, to employees, to regulators --- the choice between gain and loss framing is materially affecting downstream choice. Treat it with the seriousness the evidence implies.
Test both framings before committing to one in any high-stakes communication. This is the operational implication of the Levin-Schneider-Gaeth typology. If you do not know in advance whether your audience will respond more strongly to the gain frame or the loss frame in your specific context, run the test --- the magnitudes of framing-driven preference reversal are large enough that the choice will materially affect program effectiveness, ad performance, conversion rate, or whatever your downstream outcome measure is.
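A framing test of this kind usually reduces to comparing two proportions. The sketch below shows one conventional way to do that, a pooled two-proportion z-test. The function name is an assumption for this example, and the counts are hypothetical, invented purely to show the mechanics rather than drawn from any study cited here.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for a difference between two independent proportions."""
    p_a, p_b = successes_a / n_a, successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_b - p_a, z, p_value

# Hypothetical counts: 1,000 recipients per message, conversions under each frame.
gain_frame_conversions, loss_frame_conversions = 120, 160
diff, z, p_value = two_proportion_z_test(
    gain_frame_conversions, 1000, loss_frame_conversions, 1000
)
print(f"lift = {diff:.3f}, z = {z:.2f}, p = {p_value:.4f}")
```

The test only says whether the two frames differ in this audience. Which frame is appropriate to use is a separate question, taken up in the next recommendation.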
Disclose both framings when the chooser’s welfare depends on their making a well-considered choice. This is the ethical implication of the medical-informed-consent application. If your audience is choosing on their own behalf and you have a duty of care to them, the responsible move is to present outcomes in both frames so that the chooser is not making a decision that is an artifact of your framing choice. The same logic applies to product disclosure, to investment communication, and to any other context where you would want your chooser’s expressed preference to reflect their actual preference rather than the format you happened to present.
Update against the broad “behavioral economics is all hype” cynicism. This is the meta-implication. Prospect theory and the framing effect are the central pieces of evidence that careful applied cognitive psychology can produce findings that are robust enough to teach in medical schools, build into financial regulation, and ship into public-health communication. The replication crisis is real but the conclusion is more nuanced than the polemics suggest. The mechanism-grounded, large-effect, field-replicated findings have held up. The flashy single-paper convenience-sample findings often have not. The job is to distinguish which is which, not to throw the entire enterprise out.
Be skeptical when someone claims the framing effect explains your specific domain without measuring it. This is the conservative balancing implication. Even an effect as robust as framing has moderators --- incentivization, expertise, domain familiarity, decision time, repeated exposure. A within-subjects design with experienced professionals making incentivized choices over familiar consequences will often show a much smaller framing effect than the original Asian disease problem implied, sometimes close to zero. The framing effect is real in the aggregate; whether it is large in your specific high-stakes domain is an empirical question that should be answered with your own data, not assumed from the lab literature.
What Makes The Framing Effect Robust
Stepping back, it is worth being explicit about the diagnostic features that make the framing effect a robust finding while so many neighbors in the cognitive-psychology literature collapsed. Four features in particular distinguish this case.
The mechanism is mathematized and externally validated. The framing effect is a directly observable consequence of the kink at the reference point in prospect theory’s value function. That value function has been independently estimated from a long list of natural settings far removed from the lab vignette: financial market data, taxi-driver labor supply, golfer effort calibration, household saving decisions. Many of those estimations return values in the neighborhood of the parameters Tversky and Kahneman themselves reported. The mechanism is not a hand-wave; it is a quantitative architecture that has been re-derived from independent data sources, and the lab finding is what you would predict from the architecture. Failed-replication findings rarely have this property; their proposed mechanisms typically have no independent validation outside the original lab paradigm.
The effect size is moderate but reliable, not small and fragile. A d of 0.31 to 0.52 is not in the obvious-from-the-data range that defaults sit in, but it is not in the fragile-small-effect range that consumed money priming and power posing either. It is large enough to be detectable in moderate-sample studies without statistical heroics, and the publication-bias correction in Steiger and Kühberger 2018 pushed it up rather than down --- a direction of correction that is rare in modern meta-analytic re-appraisals.
The phenomenon replicates across domains, populations, and decades. The Many Labs result of significance in 31 of 36 labs is the strongest single piece of replication evidence. But the broader picture is what makes the case overdetermined: successful replications across medical, financial, public-policy, and insurance domains; successful replications across student and professional populations; successful replications using the original Asian disease wording and using newly constructed vignettes in unrelated outcome areas. No single replication is doing all the work; the cumulative record is what compels.
The application infrastructure is mature. Medical schools teach it. Financial regulators legislate around it. Public-health agencies build messaging programs that test for it. The Behavioural Insights Team’s field-experiment literature is full of framing manipulations that produce field-relevant effects. When the working infrastructure of multiple serious applied domains takes a phenomenon seriously enough to build operational practice around it, that institutional adoption is itself evidence of robustness. The institutions in question have access to their own outcome data, and they have not moved away from the framing effect over the last forty years.
If you run those four diagnostic features against any other behavioral finding you are considering deploying --- mechanism externally validated, effect size moderate or larger, replicates across domains and populations and decades, application infrastructure mature --- you have a fast filter for distinguishing robust applied cognitive psychology from the long tail of fragile findings that are likely to be the next casualties of this hub.
Sources
- Tversky, A., & Kahneman, D. (1981). The framing of decisions and the psychology of choice. Science, 211(4481), 453–458. DOI: 10.1126/science.7455683
- Kahneman, D., & Tversky, A. (1979). Prospect theory: An analysis of decision under risk. Econometrica, 47(2), 263–291. DOI: 10.2307/1914185
- Kahneman, D., & Tversky, A. (1984). Choices, values, and frames. American Psychologist, 39(4), 341–350. DOI: 10.1037/0003-066X.39.4.341
- Kühberger, A. (1998). The influence of framing on risky decisions: A meta-analysis. Organizational Behavior and Human Decision Processes, 75(1), 23–55. DOI: 10.1006/obhd.1998.2781
- Levin, I. P., Schneider, S. L., & Gaeth, G. J. (1998). All frames are not created equal: A typology and critical analysis of framing effects. Organizational Behavior and Human Decision Processes, 76(2), 149–188. DOI: 10.1006/obhd.1998.2804
- Steiger, A., & Kühberger, A. (2018). A meta-analytic re-appraisal of the framing effect. Zeitschrift für Psychologie, 226(1), 45–55. DOI: 10.1027/2151-2604/a000321
- Simmons, J., Nelson, L., & Simonsohn, U. (2013). “Exactly”: The most famous framing effect is robust to precise wording. Data Colada, Post [11]. https://datacolada.org/11
Related
Browse the full Replication Crisis Hub for other behavioral findings, including:
- Prospect Theory --- the value-function architecture the framing effect is a consequence of
- Loss Aversion --- robust-ish but smaller than the consulting-deck version
- Defaults / Status Quo Bias --- the other anti-example in this hub
- Mental Accounting --- the other major behavioral-economics finding that survived scrutiny
- Endowment Effect --- a prospect-theory-adjacent finding with a more mixed replication record
FAQ
Is the framing effect actually irrational?
This is contested. The strict expected-utility-theory answer is yes: a chooser whose preferences depend on the framing of mathematically identical options is violating the description-invariance axiom of EUT, and that is what “irrational” means in EUT’s formal sense. The prospect-theory answer is more nuanced: reference-dependence is a real and stable feature of how people evaluate outcomes, and a value function defined over changes-from-reference is not obviously irrational --- it might be the ecologically rational architecture for a chooser embedded in a real economy with anchoring norms and adaptive expectations. The framing effect is unambiguously a violation of EUT; whether it is “irrational” in a deeper sense is a question more about the right definition of rationality than about the data.
Did the Asian disease problem replicate after the replication-crisis era?
Yes, repeatedly. The clearest single result is from a Many Labs replication effort that found the effect statistically significant at conventional levels in 31 of 36 independent laboratories --- a roughly 86% hit rate, dramatically higher than the typical hit rate for social-priming findings of the same era. The Data Colada team also re-ran a critic’s attempted refutation with adequate sample size and found the original effect held up even with the disambiguating “exactly” wording the critic had argued was decisive. The framing effect is among the most replicable findings in the entire post-2010 catalog.
What is the actual effect size?
Kühberger 1998 (136 papers, ~30,000 participants) found a pooled Cohen’s d of approximately 0.31. The Steiger & Kühberger 2018 publication-bias-corrected re-appraisal, using p-curve analysis, found a corrected d of approximately 0.52. Most behavioral findings shrink under bias correction; the framing literature is unusual in that the corrected estimate was larger than the uncorrected one. By Cohen’s conventional benchmarks (0.2 small, 0.5 medium, 0.8 large), that is a small-to-medium effect before correction and a medium effect after it, which is robust-effect territory by behavioral-science standards.
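To make the two estimates concrete, here is a rough sketch of what each implies for study planning. It uses the standard normal-approximation formula for comparing two independent groups. The function name `n_per_group` and the defaults (two-sided alpha of 0.05, 80% power) are illustrative assumptions, not values taken from either meta-analysis.

```python
import math
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate participants needed per condition to detect a
    standardized mean difference d in a two-sided, two-group comparison.

    Normal approximation: n = 2 * (z_{1-alpha/2} + z_{power})^2 / d^2.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = std_normal.inv_cdf(power)          # quantile for the target power
    return math.ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)


# Effect sizes quoted in the answer above.
for label, d in [("Kühberger 1998", 0.31), ("Steiger & Kühberger 2018", 0.52)]:
    print(f"{label}: d = {d}, about {n_per_group(d)} participants per condition")
```

Under these assumptions, the uncorrected estimate calls for roughly 164 participants per condition and the corrected one for roughly 59. An exact t-based calculation would come out slightly higher.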
Why include framing as an anti-example in this hub?
Because the hub is meant to leave readers calibrated, not just cynical. Most behavioral-science findings collapsed under scrutiny; some did not. The framing effect is one of the central examples of what survives --- mechanism-grounded, large-effect, field-replicated, application-mature. Including the survivors alongside the casualties is how readers leave with an accurate picture of which behavioral concepts they can confidently deploy versus which they cannot.
Does the framing effect work in business and growth contexts?
The effect is well-documented in marketing-message testing, in pricing communication, in product-feature framing, and in churn-retention messaging. The attribute-framing and goal-framing categories in Levin et al.’s typology map directly onto product marketing. The honest caveat is that effect sizes in business contexts vary widely depending on stakes, expertise, decision time, and incentive structure --- the effect is real, but its magnitude in your specific domain should be measured with your own A/B test data, not assumed from the lab literature. Treat framing as a high-prior intervention worth testing, not as a guaranteed lift.
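As a minimal sketch of what measuring it with your own A/B test data can look like, the snippet below runs a pooled two-proportion z-test on conversion counts from two message framings. The function name and the counts are hypothetical and invented purely for illustration. A real analysis would also fix the sample size in advance and account for any repeated looks at the data.

```python
import math
from statistics import NormalDist


def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided pooled z-test for a difference in conversion rates.

    Returns (absolute lift of B over A, z statistic, p-value).
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)  # rate under the null of no difference
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_b - p_a, z, p_value


# Hypothetical counts: A = gain-framed message, B = loss-framed message.
lift, z, p_value = two_proportion_ztest(conv_a=480, n_a=10_000,
                                        conv_b=540, n_b=10_000)
print(f"lift = {lift:.4f}, z = {z:.2f}, p = {p_value:.3f}")
```

With these made-up numbers the loss-framed variant converts 0.6 percentage points better, yet the p-value lands just above 0.05. That is the sense in which framing is worth testing but not a guaranteed lift.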
How does the framing effect interact with loss aversion?
They are two faces of the same underlying architecture in prospect theory. The value function is concave for gains and convex for losses, is steeper for losses than for gains (loss aversion), and kinks at the reference point (which is what makes framing matter). The framing effect is, in a sense, loss aversion observed under different reference-point assignments. If you removed loss aversion from prospect theory, the framing effect would substantially weaken. If you removed reference-dependence, the framing effect would disappear entirely. They are mathematically coupled --- which is why both have to be evaluated together and why critiques of one typically constrain critiques of the other.
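The role of the reference point can be illustrated with a small sketch of the Asian disease problem. It assumes a power-form value function with the median parameter estimates from Tversky and Kahneman 1992 (alpha = 0.88, lambda = 2.25), and it deliberately leaves out probability weighting. The function names are hypothetical, and the sketch is a simplification rather than a full prospect-theory model.

```python
def value(x, alpha=0.88, lam=2.25):
    """Value of outcome x, measured as a change from the reference point.

    Concave for gains, convex for losses, and scaled by lam on the loss side.
    """
    if x >= 0:
        return x ** alpha
    return -lam * (-x) ** alpha


def prefers_sure_option(sure, gamble, **params):
    """True if the sure outcome has higher value than the gamble.

    gamble is a list of (probability, outcome) pairs; probabilities are
    used unweighted, which is a simplifying assumption.
    """
    sure_value = value(sure, **params)
    gamble_value = sum(p * value(x, **params) for p, x in gamble)
    return sure_value > gamble_value


# Gain frame: reference point is "all 600 die", outcomes are lives saved.
gain_sure = 200
gain_gamble = [(1 / 3, 600), (2 / 3, 0)]

# Loss frame: reference point is "nobody dies", outcomes are lives lost.
loss_sure = -400
loss_gamble = [(1 / 3, 0), (2 / 3, -600)]

print("gain frame prefers sure option:", prefers_sure_option(gain_sure, gain_gamble))
print("loss frame prefers sure option:", prefers_sure_option(loss_sure, loss_gamble))
```

With these parameters the sure option wins in the gain frame and the gamble wins in the loss frame, even though the two pairs of programs describe the same outcomes. In this stripped-down sketch `lam` multiplies both loss-frame options by the same factor, so the preference flip is carried by the curvature on either side of the reference point. Loss aversion would matter more in comparisons that mix gains and losses.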
What is the strongest single piece of evidence that the framing effect is robust?
The Many Labs replication result of significance in 31 of 36 laboratories is the strongest single piece of evidence. The second-strongest is the Steiger & Kühberger 2018 publication-bias correction that pushed the meta-analytic effect size up rather than down. The third-strongest is the cross-domain field-experimental record, particularly the Behavioural Insights Team’s accumulated portfolio of framing-manipulation field trials in tax compliance, energy use, and benefits uptake. Any one of these would be substantial evidence; the cumulative case across all three, together with the prospect-theory mechanism architecture, is what places framing firmly in the robust-finding category.
Should I trust the framing effect more than the default effect?
Roughly equally, but for different reasons. Defaults have a larger raw effect size (Jachimowicz et al. 2019 report a pooled d of 0.68 versus framing’s d of 0.31 to 0.52) and a cleaner record of field experiments with administrative outcomes. Framing has a more formalized mechanism (the prospect-theory value function is more developed than the multiple plausible mechanisms behind defaults) and a longer replication record at the lab level (the Many Labs hit rate is exceptional). For most practical applications, both are in the “high-confidence intervention” tier, and the choice between them is dictated by which lever is available in your specific application rather than by relative evidence quality.