April 23, 1985. Coca-Cola CEO Roberto Goizueta steps onto a stage at Lincoln Center and announces that the company is changing its flagship recipe for the first time in 99 years. The new formula is called New Coke. Coca-Cola has spent four years and over $4 million on market research validating the change. Its internal team has run 191,000 blind taste tests, and 53% of consumers preferred New Coke to the original formula. Pepsi has been eating Coca-Cola's lunch with its "Pepsi Challenge" advertising campaign. The data is overwhelming. The new formula tests better. The decision is rational.

By July 11, 1985 — less than three months later — Coca-Cola was bringing the original formula back as "Coca-Cola Classic." The grassroots backlash had been one of the worst PR crises in American corporate history. A consumer group called the "Old Cola Drinkers of America" filed a class-action lawsuit. The customer service hotline was receiving 8,000 angry calls a day. Goizueta later called the launch one of the biggest mistakes of his career.

What's interesting about this story isn't the failure. It's why the failure happened.

The taste tests were not wrong. Customers genuinely did prefer the new formula when they didn't know which one they were drinking. The market research department had not made an analytical error. They had measured the wrong thing.

The thing the surveys were measuring — which liquid tastes better in a one-ounce sip — turned out to have almost no relationship to what Coca-Cola's customers actually wanted from the brand. What customers wanted was the Coca-Cola of their childhood. The Coca-Cola of summer baseball games and grandparents' refrigerators. The Coca-Cola that meant home. None of that emotional architecture was visible in a blind taste test. The research instrument didn't measure it because the research instrument couldn't measure it.

This is, in a single $4 million case study, the central problem with most customer research ever conducted. And the underlying behavioral economics is older and more uncomfortable than most marketers realize.

The Paper That Should Have Ended the Modern Survey Industry

In 1977, two psychologists at the University of Michigan, Richard Nisbett and Timothy Wilson (who later moved to the University of Virginia), published a paper in Psychological Review called "Telling More Than We Can Know: Verbal Reports on Mental Processes." It is, by most counts, one of the most-cited papers in the history of social psychology.

The argument was uncomfortable. Nisbett and Wilson reviewed decades of experiments in which subjects were asked to explain why they had made a particular choice or held a particular preference, and then compared those explanations against the experimental variables that had actually driven the behavior. The explanations routinely missed those variables, and when subjects did get it right, they were no more accurate than outside observers guessing at the cause.

In one of their experiments, subjects were asked to evaluate four pairs of women's stockings displayed in a row from left to right. Subjects overwhelmingly preferred the rightmost pair — a well-replicated finding called the position effect. When asked why they had preferred that pair, subjects gave detailed reasons: superior knit, better elasticity, more attractive color. Not one subject mentioned position. When experimenters explicitly told subjects that the position was the only factor that varied (the stockings were physically identical), almost every subject denied that their judgment had been influenced by position at all.

Nisbett and Wilson's conclusion was direct: we do not have privileged access to our own decision-making processes. We have access to our decisions. We have access to the post-hoc stories we tell about our decisions. We do not have access to the actual machinery that produced the decision. When asked to introspect, we confabulate — we generate plausible-sounding explanations that have almost no relationship to the actual cause.

If this is true — and forty-five years of follow-up research strongly suggests it is — then almost every customer survey ever conducted has been measuring something other than what it claimed to be measuring.

Choice Blindness

The most dramatic experimental demonstration of this came in 2005, when two Swedish psychologists — Petter Johansson and Lars Hall at Lund University — designed a series of experiments they called choice blindness studies.

In the original version, experimenters showed subjects two photographs of women's faces and asked them to choose which face they found more attractive. The experimenter then handed the subject the chosen photograph and asked them to explain why they had chosen that face.

Here's the trick. Through sleight of hand — a card-trick technique — the experimenter sometimes swapped the photographs. The subject would point at face A, the experimenter would slide face B into their hand, and then ask them to justify their choice of face B.

Roughly 75% of the time, subjects did not notice the switch. They took the photograph they had supposedly chosen, looked at it, and produced detailed, confident, internally coherent explanations for why they had picked this face, even though they had pointed at a different face seconds earlier. They invented preferences they had never had, and they did so without any sense that they were inventing anything.

Johansson and Hall replicated this with supermarket taste tests of jam and tea, political opinions, and moral dilemmas. The result was consistent across domains. Humans do not just confabulate occasionally. We confabulate by default.

If you've ever read Daniel Kahneman's Thinking, Fast and Slow, this is the territory Kahneman covers in his chapters on System 1 — the fast, intuitive system that produces most of our decisions before our deliberative System 2 even engages. Phil Barden in Decoded walks through the fMRI evidence that this is not a metaphor: the brain regions associated with conscious deliberation light up after the decision has already been made elsewhere. Our experience of "deciding" is largely a reconstruction.

Why This Breaks Almost Every Customer Survey

The operational implication is stark. If you ask a customer "Why did you buy our product?" or "What features matter most to you?" or "What would make you more likely to recommend us to a friend?" — you are not getting access to the actual machinery of their decision. You are getting access to the story they tell themselves about the decision. The story is plausible, internally coherent, and almost entirely disconnected from the real driver.

This is why so many products built on "voice of the customer" research fail in market. Customers said they wanted X. The company built X. Customers did not buy X. The company concludes that customers were lying. They were not lying — at least, not in any conscious sense. They were confabulating in good faith. The thing they said they wanted was a post-hoc explanation for behaviors they couldn't actually introspect on.

The line often attributed to Henry Ford, "If I had asked people what they wanted, they would have said faster horses," is probably apocryphal, but it captures the right insight. People are very good at describing variations on what they already know. They are systematically bad at predicting their own reactions to anything genuinely new.

Steve Jobs put it more bluntly: "It's really hard to design products by focus groups. A lot of times, people don't know what they want until you show it to them." This is not arrogance. It is an accurate description of how human introspection works.

What Actually Works Instead

If self-report data is fundamentally unreliable, what do you do? The behavioral economics literature has converged on three answers:

1. Observe behavior, don't ask about it. Watch what customers actually do in the experience. Track where they click, where they pause, where they abandon. Behavior is the revealed preference. Self-report is the constructed preference. The two often don't agree. Trust the behavior.

2. Run controlled experiments. A/B tests are the gold standard for understanding what customers respond to, because they measure actual behavior under controlled variation. If you've read Ron Kohavi, Diane Tang, and Ya Xu's Trustworthy Online Controlled Experiments (the closest thing the field has to a manual), you'll see this argument made at length. Kohavi reports that managers' intuitions about which test variant will win are right about as often as a coin flip. In other words, the measured result regularly overturns the confident prediction.

3. Use the "Jobs to be Done" framework. Clayton Christensen, in Competing Against Luck, argued that the right question is never "what features do customers want?" It's "what job is the customer hiring this product to do?" The framing change matters because it pushes the inquiry away from stated preferences and toward functional outcomes — which customers can more reliably describe because they're concrete, not introspective.
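
To make point 2 concrete, here is a minimal sketch of how an A/B result might be read, assuming a simple two-variant test scored on conversion rate with a two-proportion z-test. The function name, the visitor counts, and the conversion counts are all hypothetical. A real experimentation platform would also handle sample-size planning, multiple metrics, and early peeking, none of which this sketch attempts.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates.

    conv_a / n_a: conversions and visitors in the control group.
    conv_b / n_b: conversions and visitors in the variant group.
    Returns (absolute lift, z statistic, two-sided p-value).
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("both groups need at least one visitor")

    rate_a = conv_a / n_a
    rate_b = conv_b / n_b

    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("no variation in outcomes; the test is undefined")

    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_b - rate_a, z, p_value


# Hypothetical counts, invented for illustration only.
lift, z, p_value = two_proportion_z_test(conv_a=480, n_a=10_000,
                                         conv_b=560, n_b=10_000)
print(f"lift={lift:.4f}  z={z:.2f}  p={p_value:.4f}")
```

Nothing in this calculation depends on what anyone said they preferred. The only inputs are counts of what people actually did.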

There's a useful book by the marketing researcher Gerald Zaltman called How Customers Think that argues, with substantial neuroscience backing, that roughly 95% of consumer decision-making happens below conscious awareness. The implication isn't that customer research is worthless. It's that self-report customer research is a much weaker tool than the industry currently treats it as, and that behavioral customer research — observation, experimentation, behavioral data — should be doing most of the work.

The Honest Operational Stance

The hardest thing to internalize about all of this is the humility it requires.

Most companies treat customer surveys as ground truth, as if the customer is the authority on their own preferences and the company just has to listen carefully. The Nisbett-Wilson finding, the Johansson-Hall finding, the Kahneman framework, and the New Coke disaster all point the other way. The customer is not the authority on their own decision-making. The customer has the same incomplete introspective access to their own mind as the rest of us. They will tell you stories about their preferences in good faith, and the stories will be largely confabulation.

The operational stance that follows from this isn't "ignore customers." It's "observe customers, experiment with customers, and treat their stated explanations as one weak signal among many." Watch what they do. Test what changes the behavior. Use the surveys as directional input, not as truth.
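
One way to treat stated explanations as a weak signal rather than ground truth is to check them against behavior for the same customers. The sketch below assumes you can join a survey answer to an observed purchase for each customer. The record type, field names, and sample rows are hypothetical and invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    customer_id: str
    said_would_buy: bool  # survey answer (stated preference)
    did_buy: bool         # observed purchase (revealed preference)


def purchase_rate(customers):
    """Share of the given customers who actually bought, or None if empty."""
    customers = list(customers)
    if not customers:
        return None
    return sum(c.did_buy for c in customers) / len(customers)


def say_do_gap(customers):
    """Compare observed purchase rates for customers who said yes vs. no."""
    customers = list(customers)
    said_yes = [c for c in customers if c.said_would_buy]
    said_no = [c for c in customers if not c.said_would_buy]
    return {
        "said_yes_bought": purchase_rate(said_yes),
        "said_no_bought": purchase_rate(said_no),
    }


# Made-up rows: four customers said they would buy, two said they would not.
sample = [
    Customer("c1", said_would_buy=True, did_buy=False),
    Customer("c2", said_would_buy=True, did_buy=True),
    Customer("c3", said_would_buy=True, did_buy=False),
    Customer("c4", said_would_buy=True, did_buy=False),
    Customer("c5", said_would_buy=False, did_buy=True),
    Customer("c6", said_would_buy=False, did_buy=False),
]

print(say_do_gap(sample))
```

If the two rates come out close, the survey answer carries little information about behavior for that product, which is a reason to weight it lightly rather than discard it.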

Coca-Cola's 1985 research department wasn't incompetent. They were running a well-designed study against the wrong question, and they trusted the answers because the methodology looked rigorous. The methodology was rigorous. The question was wrong.

If you take one thing from the New Coke story, take this: the gap between what customers say and what customers do is not a research bug. It is a feature of how human cognition works. The gap cannot be closed by being a better researcher with the same methodology. It can be closed only by changing the methodology entirely: measuring behavior instead of stated belief.

The brands that win in their categories are mostly the brands that figured this out the hard way, usually after a New Coke of their own.

Better to learn the lesson without paying the $4 million tuition.

Atticus Li

Experimentation and growth leader. CXL-certified CRO practitioner, Mindworx-certified behavioral economist (1 of ~1,000 worldwide). 200+ A/B tests across energy, SaaS, fintech, e-commerce, and marketplace verticals.