The framework that aged well

Most of what this hub covers is social psychology that fell apart. Power posing didn’t survive its first big replication. Bargh’s elderly-priming walk-slow effect went down in 2012. Bandura’s Bobo doll generalizations got narrower with every meta-analysis. Cognitive dissonance — the framework Festinger is more famous for — still survives as a general claim about psychological discomfort, but its specific behavioral predictions (the canonical $1/$20 induced-compliance result, free-choice spreading-of-alternatives, effort justification at the magnitudes originally reported) are contested every decade and shrink with pre-registration.

Leon Festinger’s other landmark contribution, the 1954 social comparison theory paper published in Human Relations, is the opposite story. It has held up across 70 years of accumulating evidence. Wood’s 1989 Psychological Bulletin review found the literature converged on the central claims even where it complicated the periphery. The 2002 Suls, Martin & Wheeler review in Current Directions in Psychological Science concluded the basic mechanisms were well-supported. The wave of social-media research starting around 2014 (Vogel et al., Fardouly et al.) didn’t refute it; it extended it into a domain Festinger never imagined, and the predictions worked.

This article covers a counterexample. The replication crisis is real, and most of the famous social-psychology findings of the 1950s–1970s are in trouble. Festinger’s social comparison theory is one of the conspicuous exceptions, and understanding why matters more than memorizing a list of effects to distrust. If you’re an evidence-evaluating practitioner (marketer, product manager, behavioral scientist), the question isn’t “which findings should I keep?” but “what does a finding that survives 70 years look like, and how does that differ from one that doesn’t?”

What Festinger actually proposed in 1954

The paper is “A theory of social comparison processes,” Human Relations 7(2), pages 117–140 (DOI: 10.1177/001872675400700202). It’s structured as a numbered set of hypotheses with corollaries — nine of them — and that structure is part of why the theory aged well. Festinger didn’t write a vibey narrative; he wrote falsifiable claims, each scoped narrowly.

The core hypotheses, paraphrased:

  1. People have a drive to evaluate their opinions and abilities. This isn’t optional — Festinger argued it’s a basic motivation, because accurate self-assessment has survival and social value.
  2. When objective non-social standards aren’t available, people evaluate by comparing to other people. You can measure your height with a tape measure (objective standard), but you can’t measure your “competence as a manager” or “correctness of your opinion on policy X” the same way. So you compare.
  3. The tendency to compare with a specific other decreases as that other’s opinion or ability diverges from your own. You compare to similar others. You don’t seriously compare your tennis serve to Serena Williams’ or your political opinion to that of a person with a diametrically opposed worldview; those comparisons are uninformative.
  4. There is a unidirectional drive upward in the case of abilities that is largely absent for opinions. You want to be better at things; you don’t necessarily want your opinions to be “more X.”
  5. There are non-social restraints making it difficult or impossible to change one’s ability. You can change your opinion in an afternoon; you can’t change your running speed in an afternoon. This creates asymmetric outcomes from comparison.
  6. The cessation of comparison with others is accompanied by hostility or derogation to the extent that continued comparison with them implies unpleasant consequences.
  7. Any factors which increase the importance of some particular group as a comparison group will increase the pressure toward uniformity concerning abilities and opinions within that group.
  8. If persons who are very divergent from one’s own opinion or ability are perceived as different from oneself on attributes consistent with the divergence, the tendency to narrow the range of comparability becomes stronger.
  9. When there is a range of opinion or ability in a group, the relative strength of the three manifestations of pressures toward uniformity (changing own position, changing others’ positions, restricting the range of comparison) will be different for those near the mode of the group than for those distant from the mode.

That’s the whole theory. Notice what it doesn’t do: it doesn’t make a single sensational prediction that could be turned into a viral TED talk. It makes nine narrowly-scoped claims that together describe how people calibrate self-assessments through social information.

Why the structure mattered

Most failed social-psychology effects share a structural feature: they’re a single dramatic claim (“priming elderly words makes you walk slower,” “holding a power pose changes your testosterone,” “subliminal exposure to dollar signs makes you more individualistic”) tested in one or two studies with small samples, and then promoted as a general law. Festinger’s theory was the opposite. Nine hypotheses, each tied to mechanism, each generating distinct testable predictions across multiple domains (opinions, abilities, group dynamics, individual self-evaluation). The theory was a system, not a finding.

When a system has nine load-bearing claims and most of them survive empirical testing in their original domain and generalize cleanly to new domains (social media wasn’t on Festinger’s radar in 1954, and the theory still works there), that’s strong evidence the underlying construct corresponds to something real about human cognition. Compare this to the canonical pattern of failed replications: one finding, one paradigm, dependence on specific stimuli and instructions that don’t generalize beyond the original lab.

The Wood 1989 consolidation

By the late 1980s the theory had been around for 35 years and the literature was sprawling. Joanne Wood’s 1989 review in Psychological Bulletin (“Theory and research concerning social comparisons of personal attributes,” 106(2), 231–248, DOI: 10.1037/0033-2909.106.2.231) pulled the threads together and made several updates that mattered.

Wood pointed out that Festinger’s “similarity” hypothesis was more complicated than the original paper suggested. People don’t always compare with the most similar other. They sometimes compare on related attributes, meaning they pick others who are similar on dimensions related to the dimension being evaluated, not on the dimension itself. A novice tennis player doesn’t necessarily compare their forehand to other novices’ forehands; they may compare to other people who started learning at the same age, or who have similar athletic backgrounds. This refinement (“related attributes” rather than just “similar performance”) survived subsequent testing and is now standard.

Wood also documented that comparison choice was strategically motivated, not just informational. People sometimes choose downward comparisons (comparing to someone worse off) for mood repair after threat, and choose upward comparisons (comparing to someone better off) for self-improvement motivation. Festinger had emphasized the unidirectional drive upward for abilities; Wood showed the motivational picture was richer. This was an extension of the theory, not a refutation — and it set up the next 20 years of work cleanly.

Critically, Wood’s review did not conclude “the theory is wrong” or “the effects don’t replicate.” It concluded the theory was largely supported, with refinements needed at the edges. This is what a healthy research program looks like: cumulative refinement, not collapse.

The Suls, Martin & Wheeler 2002 synthesis

The most-cited consolidation is Suls, Martin & Wheeler, “Social comparison: Why, with whom, and with what effect?” Current Directions in Psychological Science 11(5), 159–163 (DOI: 10.1111/1467-8721.00191). It’s a short, dense review article, five pages long, and it functions as a status report on the field nearly 50 years after Festinger’s original.

The headline conclusion: social comparison is robust, ubiquitous, and consequential. People compare under a wide range of conditions including some Festinger didn’t predict (e.g., people compare even when objective standards are available, because social information adds context the objective standard doesn’t capture). The direction of comparison (upward vs downward) and the kind of self-evaluation that results depend on the comparer’s goals, mood, and self-esteem level — not just on availability of comparison targets.

Suls, Martin & Wheeler organized the literature around three questions, which is why the review is so heavily cited:

  • Why do people compare? Self-evaluation, self-improvement, and self-enhancement — three distinct motives, each predicting different patterns of comparison choice.
  • With whom do people compare? Similar others (Festinger’s original claim) for self-evaluation, better-off others for self-improvement, worse-off others for self-enhancement. The motive determines the target.
  • With what effect? Upward comparisons can be inspiring (assimilation) or demoralizing (contrast). Downward comparisons can be reassuring (contrast) or threatening (assimilation, when you fear sliding down to the comparison target’s position). The effects depend on whether the comparer feels similar to or different from the target on dimensions relevant to the outcome.

That last finding — that the same comparison can produce opposite effects depending on perceived self-target similarity — is the kind of moderator that, in failing fields, is often used post hoc to explain inconsistent results. In social comparison research it was predicted, tested, and confirmed across multiple paradigms. The difference matters.

Vogel 2014: the theory walks into Facebook

Erin Vogel and colleagues published “Social comparison, social media, and self-esteem” in Psychology of Popular Media Culture 3(4), 206–222 (DOI: 10.1037/ppm0000047) in 2014. The paper presented two studies. The first was correlational: heavier Facebook use was associated with lower trait self-esteem, and that association was mediated by frequency of social comparison on Facebook. The second was experimental: participants were randomly assigned to view a Facebook profile that depicted either upward (high social activity, healthy lifestyle) or downward (low social activity, unhealthy lifestyle) comparison information. Upward exposure produced lower state self-esteem; downward exposure produced higher state self-esteem.

This is exactly what Festinger’s framework, refined by Wood and Suls/Martin/Wheeler, would predict. Take a population motivated to evaluate themselves, put them in an environment dense with information about similar others, vary the direction of comparison, and the predicted self-evaluation effects appear. Vogel et al. didn’t have to retrofit the theory to social media — the theory was already general enough to make the prediction.

The study has its own limitations (modest sample sizes, single platform, US college students), and the field has since shown that social-media comparison effects are heterogeneous across users — some people are more susceptible, some platforms and content types produce stronger effects, and the effect of “Facebook use” on well-being depends heavily on what kind of use. None of that overturns the comparison-process explanation; it refines it.

Fardouly 2015: body image and Facebook

Jasmine Fardouly et al., “Social comparisons on social media: The impact of Facebook on young women’s body image concerns and mood,” Body Image 13, 38–45 (DOI: 10.1016/j.bodyim.2014.12.002), published in 2015, made the connection more specific. Women were randomly assigned to spend time on Facebook, an appearance-neutral website, or a magazine website. Facebook exposure (relative to the appearance-neutral control) produced more negative mood. For women high in the trait tendency to compare appearance, Facebook exposure also produced greater facial, hair, and skin-related discrepancies between current and ideal self-image.

Again: this is what social comparison theory plus Wood’s downward/upward extensions plus the Suls/Martin/Wheeler moderator framework would predict. Appearance is a domain where people have a unidirectional drive upward (we want to look better, like Festinger’s original ability claim), where objective standards are weak (beauty is socially constructed), and where similar-other comparison targets are dense and curated on visual social platforms. The theory says comparison effects should be strong in exactly this context, and they are.

It’s important to note that subsequent literature on social media and body image has been heterogeneous. Some studies find no causal effects, some find effects only for specific subgroups, some find effects only on specific outcomes. The social comparison mechanism — the process by which people evaluate themselves against perceived others — is well-supported. The aggregate claim “social media use causes body dissatisfaction” is much more contested, and that’s because aggregate use combines many different kinds of comparison processes and many different individual susceptibilities.

Upward vs downward comparison: the predicted asymmetry

A central refinement that emerged in the 1980s and 1990s is that upward and downward comparisons have differential effects on mood and motivation, and the pattern depends on whether the comparer is in a self-evaluation, self-improvement, or self-enhancement frame.

The simplified empirical picture:

  • Upward comparison (target is better off than self):
    • Inspires (assimilation effect) when the target feels attainable and the comparer is in a self-improvement frame
    • Demoralizes (contrast effect) when the target feels unattainable or the comparer is in a self-enhancement frame
  • Downward comparison (target is worse off than self):
    • Reassures (contrast effect) when the comparer is feeling threatened or in a self-enhancement frame
    • Threatens (assimilation effect) when the comparer fears sliding into the target’s position

This asymmetry is well-replicated. It’s also the source of much of the apparent inconsistency in old social-comparison studies that didn’t control for motivational frame — and it’s the source of much of the heterogeneity in modern social-media research that doesn’t measure individual differences in comparison motive.
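The simplified picture above can be read as a small decision table. The following is a minimal Python sketch of that table, not a model from any of the cited papers. The names `Direction`, `Frame`, and `predicted_effect`, and the three boolean flags, are hypothetical labels chosen for illustration. Two choices are assumptions of the sketch: combinations the bullets don't cover return "unspecified", and fear of sliding into the target's position takes precedence over the other downward conditions.

```python
from enum import Enum


class Direction(Enum):
    UPWARD = "upward"      # target is better off than self
    DOWNWARD = "downward"  # target is worse off than self


class Frame(Enum):
    SELF_EVALUATION = "self-evaluation"
    SELF_IMPROVEMENT = "self-improvement"
    SELF_ENHANCEMENT = "self-enhancement"


def predicted_effect(direction, frame, *, target_attainable=True,
                     feels_threatened=False, fears_sliding=False):
    """Return the effect the simplified picture predicts, as a label."""
    if direction is Direction.UPWARD:
        # Unattainable target or self-enhancement frame: contrast.
        if not target_attainable or frame is Frame.SELF_ENHANCEMENT:
            return "demoralizes (contrast)"
        # Attainable target plus self-improvement frame: assimilation.
        if frame is Frame.SELF_IMPROVEMENT:
            return "inspires (assimilation)"
        return "unspecified"
    if direction is Direction.DOWNWARD:
        # Assumption: fear of sliding overrides the reassurance conditions.
        if fears_sliding:
            return "threatens (assimilation)"
        if feels_threatened or frame is Frame.SELF_ENHANCEMENT:
            return "reassures (contrast)"
        return "unspecified"
    raise ValueError(f"unknown direction: {direction!r}")
```

The point of writing it this way is that direction alone never determines the output. Every branch needs at least one moderator, which is why studies that record only direction can look inconsistent.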

A productive theory generates moderators that resolve apparent inconsistencies. A failing theory accumulates inconsistencies that no one can resolve. Festinger’s framework has done the former.

Why this survived where Festinger’s other theory’s specific claims didn’t

Festinger’s name shows up twice in any survey of mid-20th-century social psychology — once for cognitive dissonance (the 1957 book, the 1959 Festinger & Carlsmith study, the broad claim that inconsistency between cognitions produces motivated change), and once for social comparison theory (the 1954 paper). Both have a strong central insight. But the trajectories diverged.

Cognitive dissonance is a single mechanism (psychological discomfort from inconsistency) that was applied to a huge variety of specific paradigms — induced compliance, free choice, effort justification, hypocrisy induction, post-decisional regret. Many of these specific paradigms have had replication trouble. The general claim that people experience and try to reduce inconsistency-related discomfort survives; the more specific claim that this happens at the magnitudes and in the directions originally reported in the 1950s–1960s paradigms is on shakier ground.

Social comparison theory is a system of nine narrowly-scoped hypotheses about a different process (self-evaluation through reference to others) that generated a research program of moderators and refinements, not paradigm-specific predictions that needed to hold up exactly. When Wood (1989) refined the similarity hypothesis to “related attributes,” the theory got better. When Suls, Martin & Wheeler (2002) articulated the three motives, the theory got better. When Vogel et al. (2014) extended it to social media, the theory got better. There was no canonical paradigm whose collapse would discredit the framework, because the framework was distributed across many paradigms and contexts.

This is a useful diagnostic for evaluating any psychology claim:

  • Is the theory a single dramatic prediction or a structured system of moderated claims?
  • Does the supporting literature get better as it accumulates, or does it accumulate moderators that look like post-hoc rescues?
  • Does the theory generalize cleanly to new domains, or does each new domain require retrofitting?
  • Are the predictions risky (could be falsified by data the original researchers didn’t have) or safe (consistent with many possible outcomes)?

Social comparison theory scores well on all four. Most of the famously failed effects score poorly on most of them.
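If you want to apply the four questions consistently across claims, they can be kept as a checklist. This is a minimal Python sketch, assuming one point per "yes" with equal weights. The scoring scheme, the class name `ClaimDiagnostic`, and its field names are hypothetical and do not come from the literature. The answers you feed in remain a judgment call.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ClaimDiagnostic:
    # One boolean per diagnostic question; True is the favorable answer.
    structured_system: bool      # system of moderated claims, not one dramatic prediction
    literature_improves: bool    # evidence gets better as it accumulates
    generalizes_cleanly: bool    # new domains need no retrofitting
    risky_predictions: bool      # predictions could have been falsified

    def score(self) -> int:
        """Count favorable answers (0 to 4). Equal weighting is an assumption."""
        return sum(getattr(self, f.name) for f in fields(self))


# The article's own reading: social comparison theory scores well on all four.
social_comparison = ClaimDiagnostic(True, True, True, True)

# A hypothetical isolated finding that fails every question.
isolated_finding = ClaimDiagnostic(False, False, False, False)

print(social_comparison.score(), isolated_finding.score())
```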

Strategist takeaway

If you build products, run experiments, or design marketing systems, here’s the practical version.

One. When you’re evaluating a behavioral claim, ask whether it’s a system or a single finding. Systems with multiple load-bearing predictions and clear mechanism are more bet-able than isolated findings, even when the isolated finding has a more compelling demo. Social comparison theory is a system; “power posing changes hormones” was a single finding.

Two. Use the comparison mechanism in product design, but use the moderated version, not the cartoon version. Showing users how they compare to others is not uniformly motivating: the effect depends on direction (upward/downward), perceived similarity to the comparison target, and the user’s current motivational frame (improvement, evaluation, enhancement). A fitness app that shows new users only upward comparisons to elite users risks demoralizing them, because the target feels unattainable. More effective designs show comparisons to similar-stage users (related attributes) and let users frame the comparison as self-improvement information.

Three. When you’re explaining counterintuitive product behavior — users churning faster on a feature that’s “objectively better,” users disengaging from leaderboards, users reporting lower satisfaction after a social-feed redesign — social comparison is usually one of the top candidate mechanisms. Most of the time it’s invisible because users don’t say “I felt worse after seeing other people’s success.” They just churn. Suls/Martin/Wheeler’s three-motive framework is a useful diagnostic checklist.

Four. Don’t confuse “social comparison works” with “any leaderboard or social proof feature will increase engagement.” The mechanism is real; the implementation determines whether you trigger improvement motivation or self-enhancement defense (i.e., disengagement). The number of social-proof designs that backfire because they trigger unattainable-upward-comparison demoralization is large, especially in fitness, finance, and learning products.

Five. Use Festinger’s theory the way the field uses it: as a robust framework whose central claims are well-supported and whose moderators tell you when the effects flip direction. Don’t use it as an excuse to add a leaderboard.
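Point Two above (compare users to similar-stage peers rather than to elite users) can be sketched as a filter over candidate peers. This is a minimal Python illustration for a hypothetical fitness app. The `User` fields, the function name `comparison_peers`, and the thresholds `max_week_gap` and `max_uplift` are all assumptions made for the example. Treating "slightly better than you" as a stand-in for an attainable upward target is also an assumption, not a validated rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    weeks_active: int   # related attribute: how long the user has been training
    weekly_km: float    # the performance dimension being compared


def comparison_peers(user, candidates, *, max_week_gap=4, max_uplift=0.25):
    """Pick peers at a similar stage whose performance is modestly ahead.

    max_week_gap and max_uplift are illustrative thresholds, not values
    taken from any study; tune and test them in your own product.
    """
    upper = user.weekly_km * (1 + max_uplift)
    peers = []
    for other in candidates:
        if other.name == user.name:
            continue
        similar_stage = abs(other.weeks_active - user.weeks_active) <= max_week_gap
        attainable = user.weekly_km <= other.weekly_km <= upper
        if similar_stage and attainable:
            peers.append(other)
    # Closest targets first, so the nearest attainable comparison leads.
    return sorted(peers, key=lambda u: u.weekly_km)
```

With a new user at 10 km per week, this keeps peers in their first weeks who run 10 to 12.5 km and drops the elite runner at 80 km. Whether that selection actually improves retention is an empirical question for your own data.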

Sources

  • Festinger, L. (1954). A theory of social comparison processes. Human Relations, 7(2), 117–140. DOI: 10.1177/001872675400700202
  • Wood, J. V. (1989). Theory and research concerning social comparisons of personal attributes. Psychological Bulletin, 106(2), 231–248. DOI: 10.1037/0033-2909.106.2.231
  • Suls, J., Martin, R., & Wheeler, L. (2002). Social comparison: Why, with whom, and with what effect? Current Directions in Psychological Science, 11(5), 159–163. DOI: 10.1111/1467-8721.00191
  • Vogel, E. A., Rose, J. P., Roberts, L. R., & Eckles, K. (2014). Social comparison, social media, and self-esteem. Psychology of Popular Media Culture, 3(4), 206–222. DOI: 10.1037/ppm0000047
  • Fardouly, J., Diedrichs, P. C., Vartanian, L. R., & Halliwell, E. (2015). Social comparisons on social media: The impact of Facebook on young women’s body image concerns and mood. Body Image, 13, 38–45. DOI: 10.1016/j.bodyim.2014.12.002

FAQ

Is Festinger’s social comparison theory the same as his cognitive dissonance theory?

No. They’re distinct theories that share an author. Social comparison theory (1954) is about how people evaluate their own opinions and abilities by comparison to others. Cognitive dissonance theory (1957) is about psychological discomfort from holding inconsistent cognitions and the motivation to reduce it. The social comparison framework has held up better than the specific behavioral predictions of dissonance theory’s canonical paradigms.

Has social comparison theory ever failed a replication?

Specific applications and moderators have produced mixed results, especially around what kinds of social-media use produce well-being effects. But the core mechanism — that people evaluate themselves through reference to others, prefer similar others on related attributes for evaluation, and experience differential effects from upward and downward comparisons — has accumulated converging evidence over 70 years and survived in updated form.

Is the theory falsifiable?

Yes. Each of Festinger’s nine hypotheses generates predictions that could in principle have failed. The “similarity” hypothesis was modified by Wood (1989) when empirical work showed people compare on related attributes rather than just similar performance. That’s the theory being revised by data — exactly what falsifiable theories should do.

Why does this matter for marketing or product design?

Because social comparison is one of the most reliable behavioral mechanisms you can design around — but only if you use the moderated version. Naive leaderboards and “users like you achieved X” features often backfire because they trigger unattainable-upward-comparison demoralization rather than self-improvement motivation. The 70-year literature gives you the moderators to design with intent.

How does this compare to other 1950s social-psychology theories?

Most fared worse. Many famous 1950s–1970s findings (the Asch conformity rates, Milgram’s specific obedience percentages, Bandura’s Bobo doll generalizations, Festinger & Carlsmith’s specific dollar-magnitude dissonance findings) have been qualified on re-examination: some vary with era, culture, or procedure, and some shrink substantially in later tests. Social comparison theory is one of the conspicuous exceptions.

Is there a single best modern citation if I only read one thing?

The Suls, Martin & Wheeler 2002 Current Directions article is five pages, summarizes 48 years of literature, and organizes everything around three questions (why, with whom, with what effect). It’s the right entry point.

What’s the practical limit of the theory?

Like any behavioral framework, it’s a description of mechanism, not a recipe for specific outcomes. It tells you that people compare and the broad direction of effects under specified conditions. It doesn’t tell you exactly how a specific design choice in your product will affect a specific user’s behavior. You still have to test.
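As one illustration of what "test" can mean for a comparison feature, here is a minimal Python sketch of a two-sided, two-proportion z-test using only the standard library. It assumes a simple A/B setup with a binary outcome (for example, retained or not) and samples large enough for the normal approximation. The function name and parameters are hypothetical, and the sketch is not a substitute for a full experimentation workflow.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two_sided_p) for the difference in rates, B minus A.

    conv_a and conv_b are success counts; n_a and n_b are sample sizes.
    Uses the pooled standard error under the null of equal rates.
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("sample sizes must be positive")
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("standard error is zero; rates are all 0 or all 1")
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: control (no peer comparison) vs. variant (similar-stage peers).
z, p = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=130, n_b=1000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

Because the theory predicts that effects can flip direction by subgroup, an aggregate result like this one can hide offsetting effects. Segmenting by user stage or comparison motive is worth planning before the test runs.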

Atticus Li

Experimentation and growth leader. CXL-certified CRO practitioner, Mindworx-certified behavioral economist (1 of ~1,000 worldwide). 200+ A/B tests across energy, SaaS, fintech, e-commerce, and marketplace verticals.