Rosenhan's "On Being Sane in Insane Places": The Foundation Study Cahalan Showed Was Largely Fabricated

In 1973, eight "pseudopatients" allegedly checked into 12 mental hospitals, fooled every psychiatrist, and triggered a half-century of policy reform. In 2019, Susannah Cahalan went looking for the evidence. Most of it was missing — and what she did find contradicted the published paper.

By Atticus Li · May 18, 2026

The canonical story goes like this. Eight ordinary people (a psychology graduate student, three psychologists including Rosenhan himself, a pediatrician, a psychiatrist, a painter, and a housewife) walked into 12 different psychiatric hospitals across five US states between 1969 and 1972. Each pseudopatient told the admitting psychiatrist the same single, vague complaint: they had been hearing voices that said the words “empty,” “hollow,” and “thud.” Beyond that single symptom, they answered every question truthfully and behaved entirely normally.

Every one of them was admitted. Seven were diagnosed with schizophrenia, one with manic-depressive psychosis. Once inside, they immediately stopped reporting any symptoms and behaved as their normal selves. They were nevertheless held for an average of 19 days each --- one for 52 days --- and all but one were eventually discharged with a diagnosis of “schizophrenia in remission.” Their normal behavior on the ward was systematically reinterpreted by staff as evidence of pathology. Note-taking became “writing behavior” in their charts. Pacing because they were bored was logged as agitation.

The paper, published in Science in January 1973, was titled “On Being Sane in Insane Places.” Its author was David Rosenhan, a Stanford law-and-psychology professor. The conclusion --- that psychiatric institutions could not reliably distinguish the sane from the insane --- detonated through the field. It accelerated the deinstitutionalization movement that emptied state mental hospitals across America. It became the haunting backdrop to the creation of DSM-III in 1980, which scrapped Freudian categories in favor of symptom checklists explicitly designed to “pass the Rosenhan test.” It became foundational scripture for the anti-psychiatry movement. It appeared in nearly every introductory psychology textbook published in the following five decades.

In 2019, the journalist Susannah Cahalan published the results of a multi-year investigation into the study. She had spent years trying to find the eight pseudopatients, locate the hospital records, and verify the published claims. What she found --- documented in her book The Great Pretender --- is one of the most consequential pieces of investigative work in modern social science. It is also one of the most uncomfortable.

She could identify only two of the eight pseudopatients. Rosenhan’s own medical records, which she obtained, sharply contradicted what he had published. The one other pseudopatient she could locate gave an account that diverged from the paper. And a ninth pseudopatient, Harry Lando, had been quietly dropped from the published results, apparently because his experience didn’t fit the narrative.

This article walks through what Rosenhan claimed, what Cahalan found, and what the gap between them means for anyone whose job involves trusting --- or building strategy on top of --- famous behavioral science findings.

What Rosenhan 1973 Claimed

The published paper’s methodology was deceptively simple. As described in Science, Rosenhan and seven recruits (five men and three women, drawn from psychology, medicine, art, and homemaking) presented themselves at 12 different psychiatric hospitals across five states. They used pseudonyms and false occupations to prevent identification. They each reported a single auditory hallucination: unfamiliar voices, of the same sex as the pseudopatient, that said “empty,” “hollow,” and “thud.” These specific words were chosen, Rosenhan wrote, because they suggested existential despair without matching any published symptom pattern for schizophrenia.

Beyond this one fabricated symptom, the pseudopatients gave accurate life histories. They reported genuine relationships, real frustrations, normal childhood experiences. The point was that everything except the three-word hallucination was real. If psychiatrists were doing diagnostic work that could distinguish illness from health, the genuine biography should anchor the assessment. The fake symptom alone shouldn’t tip them into a major mental-illness diagnosis.

Once admitted, the pseudopatients dropped the symptom entirely. They told staff the voices had stopped. They behaved normally on the wards. They cooperated with treatment, took --- or pretended to take --- their medications, and waited to be released. Their stays ranged from 7 to 52 days. The average was 19 days. They were eventually released with the diagnosis “schizophrenia in remission” rather than as misdiagnosed.

Inside the wards, the pseudopatients also reported a striking observation about how staff reinterpreted normal behavior. One pseudopatient took copious notes; the staff logged this in the chart as “the patient engages in writing behavior.” Another paced because he was bored; the chart described agitation. Conversations with staff lasted on average just over six minutes per day. Most contact with patients came from low-status orderlies and aides.

A second part of the study, often less discussed, claimed an inverse demonstration. Rosenhan, the paper said, told one teaching hospital that over the next three months, one or more pseudopatients would attempt to gain admission. In fact, he sent none. Of the 193 patients admitted in that period, the paper reported, 41 were judged with high confidence to be pseudopatients by at least one staff member, 23 were considered suspect by at least one psychiatrist, and 19 were suspected by one psychiatrist and one other staff member. Rosenhan presented this as evidence that diagnostic judgments were essentially arbitrary in either direction.

The conclusion: psychiatric diagnosis was fundamentally unreliable. The labels assigned by hospitals did not correspond to anything observable in behavior. Once labeled, a person could not act their way out of the label. Mental hospitals, Rosenhan wrote, “cannot distinguish the sane from the insane.”

The paper landed at exactly the right cultural moment. Goffman’s Asylums had appeared a decade earlier. Thomas Szasz’s anti-psychiatry critique was in full circulation. Ken Kesey’s novel One Flew Over the Cuckoo’s Nest had been in print since 1962, and the Jack Nicholson film adaptation would follow in 1975. On Being Sane in Insane Places became the rare academic study that crossed fully into popular discourse. The American Psychiatric Association convened an emergency meeting within a month of publication. The deinstitutionalization push, which would eventually shut down most state mental hospitals across the United States, now had the empirical paper everyone could cite.

The Immediate Methodological Critique (Spitzer 1975)

Robert Spitzer, the psychiatrist who would soon lead the development of DSM-III, did not wait long to respond. In 1975 he published “On Pseudoscience in Science, Logic in Remission, and Psychiatric Diagnosis,” an 11-page evisceration in the Journal of Abnormal Psychology. Spitzer’s central argument has held up better than Rosenhan’s paper.

His core analogy, borrowed from psychiatrist Seymour Kety, was devastating in its simplicity: “If I were to drink a quart of blood and, concealing what I had done, come to the emergency room of any hospital vomiting blood, the behavior of the staff would be quite predictable. If they labeled and treated me as having a bleeding peptic ulcer, I doubt that I could argue convincingly that medical science does not know how to diagnose that condition.” A pseudopatient who deliberately simulates a symptom and conceals the simulation is not a fair test of diagnostic accuracy. Such a presentation tests only whether physicians assume their patients are lying.

Spitzer also showed that the discharge diagnosis “schizophrenia in remission” was, in itself, a sign that the system had correctly identified that the pseudopatients were not currently ill. He surveyed the literature and found that “schizophrenia in remission” was a vanishingly rare discharge designation at the time --- used in roughly 1 in 200 cases at major hospitals. That the hospitals applied it to Rosenhan’s pseudopatients suggested that the psychiatrists noticed something was different about these patients. Rosenhan’s own data, in other words, partially refuted his interpretation.

Spitzer further pointed out that Rosenhan refused to release the underlying data, the hospital identities, or the specific pseudopatient records --- which prevented any independent replication or audit. This was not an unusual stance for psychology research in the 1970s. It became important in 2019.

These critiques were known to the field from the moment the paper was published. They did not slow its cultural momentum. The story was simply too good.

What Cahalan Found in 2019

Susannah Cahalan came to the Rosenhan story with unusual credentials. As a young reporter at the New York Post in 2009, she had been hospitalized for an extended psychotic episode that was eventually diagnosed as anti-NMDA-receptor encephalitis --- a treatable autoimmune disease that had nearly been mistaken for primary psychiatric illness. Her 2012 memoir Brain on Fire became a bestseller. Her interest in the Rosenhan paper was personal as well as professional: she wanted to understand what the diagnostic system had almost done to her, and Rosenhan was the canonical text on what that system was supposedly unable to see.

She spent years on it. Her access ultimately included Rosenhan’s personal papers, given to her by his executor; correspondence; his unpublished book manuscript on the study; and, crucially, his own hospital record from Haverford State Hospital in Pennsylvania, where he had checked himself in as the lead pseudopatient.

Four findings stand out.

First, she could only confidently identify two of the eight pseudopatients. One was Rosenhan himself, who had used the pseudonym “David Lurie.” The other was Bill Underwood, a Stanford graduate student. Despite years of searching --- archival work, interviews with Rosenhan’s family, contact with everyone in his academic orbit who might have been involved --- the other six pseudopatients could not be located, did not turn up in his papers, did not appear in any documentation, and have never come forward in the half-century since publication. Cahalan does not claim this proves they didn’t exist, but the inability to verify the existence of six of eight study participants in a famous study is a significant evidentiary problem.

Second, Rosenhan’s own hospital record sharply contradicted his published account. The paper claimed pseudopatients reported only the three-word auditory hallucination and were otherwise candid and normal. Cahalan obtained Rosenhan’s actual intake record. In it, he reported hearing radio signals, intercepting other people’s thoughts, putting copper pots over his ears to block the transmissions, months of suicidal ideation, an inability to work or sleep, mounting financial trouble, speech difficulties, twitching, and grimacing. Any one of these symptoms is, by itself, a more substantial reason to admit a patient and consider a schizophrenia-spectrum diagnosis than “empty, hollow, thud.” The paper described a minimal, controlled stimulus. The records describe a much more elaborate presentation that any responsible psychiatrist would take seriously.

Third, Bill Underwood’s account diverged from the published paper. Underwood spent eight days at a hospital with roughly 1,500 patients, not the seven days at a hospital with 8,000 patients that the published paper attributed to one of the pseudopatients. He also reported that Rosenhan had not carefully prepared the volunteers as the paper claimed --- just a brief conversation and basic instruction on how to hide pills in his cheek. Underwood, notably, defended much of his hospital experience to Cahalan: he believed Rosenhan’s overall conclusion captured something real about how hospitals dehumanized patients. But the factual details he gave did not line up with the published methodology.

Fourth, the Harry Lando case. Harry Lando was a Stanford graduate student who participated as the ninth pseudopatient. He was admitted to the US Public Health Service Hospital in San Francisco. According to Rosenhan’s own notes, found in his papers, Lando’s experience at the hospital was substantially positive. He found the staff caring. He found the structure helpful. Rosenhan’s notation about Lando, in capital letters in his own handwriting, was “HE LIKES IT!” Rosenhan subsequently dropped Lando from the published study on the grounds that Lando had “falsified aspects of his personal history.” This was, in context, a strange justification: Rosenhan had instructed all the pseudopatients to use false identities. The actual reason for the exclusion, Cahalan argues, is that Lando’s positive experience contradicted the narrative the paper was meant to deliver.

Cahalan also found, separately, that some of the specific incidents Lando reported in his hospital --- a particular interaction, a particular observation about staff --- reappeared in the published paper attributed to other, unidentified pseudopatients. This is not the kind of error a careful author makes by accident.

The picture that emerges, from records that nobody had bothered to look at for 46 years, is of a study whose published methodology and results do not match its own underlying documentation.

The Specific Inconsistencies

It’s worth stating cleanly which inconsistencies Cahalan documented and which she inferred, because the difference matters for how strongly one can characterize what happened.

Documented, with records:

Rosenhan reported far more elaborate symptoms at intake than the paper claimed any pseudopatient did. His hospital chart records sleep loss, suicidal ideation, sensory disturbances involving radio signals and copper-pot ear blockers, speech disturbances, and motor symptoms.
A quoted passage in the published paper, attributed to medical records, does not appear in the medical records Cahalan obtained. The chart contains nothing matching the quote.
Rosenhan’s papers contain notes on Harry Lando’s positive hospitalization experience, the words “HE LIKES IT!” in Rosenhan’s own hand, and the decision to exclude Lando from the published study.
Bill Underwood, the only other identifiable pseudopatient, gave Cahalan length-of-stay and hospital-size details that do not match the paper.
The promised writs of habeas corpus that the paper said pseudopatients filed --- as evidence that they tried to get themselves released --- do not appear to have been filed. Cahalan searched the records of the relevant courts.

Inferred from absence of evidence:

Six of the eight pseudopatients cannot be identified, located, or documented. Their existence is asserted in the paper but cannot be independently verified. Cahalan does not claim they were fabricated. She does point out that, given the lack of records, the question is legitimately open.
The second part of the study, the “no pseudopatients sent” sting that supposedly produced dozens of false-positive identifications at a teaching hospital, has no surviving documentation in Rosenhan’s papers. The teaching hospital has never been identified.

The historian Andrew Scull, author of Desperate Remedies (2022), has called the Rosenhan study a “successful scientific fraud.” That is stronger language than Cahalan herself uses. Cahalan tends to use words like “likely fabricated,” “substantially invented,” and “the evidence we have does not match what was published.” The narrower and better-documented claim is that the published paper does not accurately describe what the surviving records show. That is enough.

It is not enough to call Rosenhan a fraud in the colloquial criminal sense, and Cahalan does not. He died in 2012 and cannot defend himself. The record we have is the record. The record does not match the paper.

The Policy Consequences

The reason any of this matters --- the reason this is not merely a footnote to the history of social psychology --- is that this paper drove policy that affected millions of people.

Deinstitutionalization. The closing of state mental hospitals had been underway since the 1950s, driven by the introduction of antipsychotic medications, civil libertarian arguments about involuntary commitment, fiscal pressure on state budgets, and the federal Community Mental Health Act of 1963. Rosenhan’s paper did not start the movement. But it provided the canonical academic citation. Throughout the 1970s and 1980s, advocacy for hospital closures cited On Being Sane in Insane Places as proof that the hospitals could not even tell who was sick. The community-based services that were supposed to replace the hospitals, in many states, were never adequately funded. The result was the population we now describe euphemistically as the chronically homeless mentally ill, and the de facto transfer of severe mental-illness care to county jails and state prisons --- which today, by population, are the largest mental-health facilities in the country.

It is impossible to estimate cleanly how much of this outcome traces to Rosenhan’s paper, because deinstitutionalization had many fathers. The honest claim is that Rosenhan provided the cleanest academic warrant for the most aggressive version of the policy --- the version that emphasized hospitals as harmful rather than hospitals as in need of reform.

DSM-III. Robert Spitzer, who had eviscerated Rosenhan’s paper in 1975, led the development of DSM-III, published in 1980. DSM-III was a complete overhaul of psychiatric diagnosis, replacing Freudian and psychoanalytic categories with explicit, observable symptom checklists. Spitzer’s stated motivation for the rigor of the checklists was, in his own words, partly to make sure no future paper could ever produce a Rosenhan-style demonstration of unreliability. Every proposed diagnostic criterion was tested against the question of whether it would “pass the Rosenhan test.” DSM-III then shaped psychiatric diagnosis worldwide for the next forty years. So the paper that may have been substantially fabricated drove the architecture of the diagnostic manual that continues to govern how nearly all psychiatric diagnoses get made.

The anti-psychiatry movement. Rosenhan was not himself an anti-psychiatrist. But his paper became the most-cited academic source in anti-psychiatry literature. It legitimized public skepticism about involuntary commitment, electroconvulsive therapy, and psychiatric authority generally. Some of that skepticism produced real reform. Some of it produced lasting public distrust of mental-health treatment that may have prevented people from seeking care.

Textbook canonization. On Being Sane in Insane Places entered nearly every introductory psychology textbook published between 1975 and the late 2010s. Generations of undergraduates have been taught, as established fact, that this study demonstrated the unreliability of psychiatric diagnosis. The replacement of this teaching with a more accurate version --- “a famous study that, on later archival investigation, appears to be substantially inconsistent with its own underlying records” --- has barely begun.

What This Means for “Foundational” Behavioral Research

The disquieting fact about Rosenhan’s paper is not merely that it was probably wrong. It is that it took 46 years for anyone to seriously check.

Spitzer’s 1975 critique focused on the logic of the methodology --- the bleeding-ulcer analogy, the discharge-diagnosis evidence. He did not have access to the underlying records, and as far as we know, he did not try to interview the pseudopatients or audit the hospital data. Nor did anyone else. For nearly half a century, the field accepted the paper on the author’s representation that the data existed as described. There was no organized incentive, in academia, for someone to spend three to five years tracking down the documentary basis of a famous old study. Cahalan was a journalist, working on a book, with a personal stake in the topic. That was the social structure that finally produced the audit.

This is a structural feature of how social science works, and it is not unique to Rosenhan. Diederik Stapel published faked data for years before a graduate student finally questioned it. Brian Wansink’s food-psychology empire collapsed in 2017 because outside critics finally went through the actual data. The Stanford Prison Experiment was definitively re-examined in 2019, almost five decades after the fact, by a French researcher who got into the archives. Daryl Bem’s “feeling the future” paper sat unaudited until skeptics tried to replicate it. In each case, the problems surfaced only when someone went back to the data, the records, or the experiment itself, and for Rosenhan and the Stanford Prison Experiment that took nearly five decades.

There is no working mechanism in the social sciences to systematically re-audit famous historical studies. The replication crisis has produced large-scale replication projects (the Open Science Collaboration’s 2015 Reproducibility Project being the most prominent), but those replications operate on new samples with the published methodology. They cannot detect a study whose published methodology was different from what actually happened. Detecting that kind of problem requires going back to original records, original participants, original notes --- the kind of work Cahalan did for the Rosenhan study and that almost nobody else ever does for almost any other study.

For a strategist or an evaluator of behavioral-science evidence, the implication is uncomfortable. The fact that a study is famous does not mean it has been thoroughly checked. The fact that it has been cited 10,000 times does not mean the citations represent independent verification. It usually means they represent the original publication being passed along, again and again, by people who never had reason to inspect the underlying records and would have had no easy way to do so even if they wanted to.
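The point about citations can be made concrete with the odds form of Bayes’ rule. The sketch below is illustrative only: the prior and every likelihood ratio are hypothetical numbers chosen to show the shape of the argument, not estimates for the Rosenhan study or any other paper. The assumption doing the work is that a citation which merely repeats the original publication carries a likelihood ratio of about 1, because it would exist whether or not the underlying claim were true.

```python
def posterior(prior, likelihood_ratios):
    """Update a probability with a sequence of likelihood ratios.

    Uses the odds form of Bayes' rule: posterior odds equal prior odds
    multiplied by each likelihood ratio in turn.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    odds = prior / (1.0 - prior)
    for ratio in likelihood_ratios:
        odds *= ratio
    return odds / (1.0 + odds)


# Hypothetical values, for illustration only.
PRIOR = 0.5                   # credence before reading anything
ORIGINAL_PAPER_LR = 3.0       # one published study, taken at face value
DEPENDENT_CITATION_LR = 1.0   # a citation that just repeats the paper
FAILED_AUDIT_LR = 0.1         # records that contradict the published account

after_paper = posterior(PRIOR, [ORIGINAL_PAPER_LR])
after_citations = posterior(
    PRIOR, [ORIGINAL_PAPER_LR] + [DEPENDENT_CITATION_LR] * 10_000
)
after_audit = posterior(PRIOR, [ORIGINAL_PAPER_LR, FAILED_AUDIT_LR])

print(f"after the original paper:   {after_paper:.2f}")
print(f"after 10,000 citations:     {after_citations:.2f}")
print(f"after one failed audit:     {after_audit:.2f}")
```

Under these assumed numbers, ten thousand dependent citations leave the credence exactly where the single paper put it, while one documentary audit that contradicts the paper moves it well below the starting point. The specific values are arbitrary; the asymmetry between repetition and independent checking is what the sketch is meant to show.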

What’s Honest to Say About Psychiatric Diagnosis Now

It would be wrong to take from this story the conclusion that psychiatric diagnosis is reliable, or that the original Rosenhan paper was simply wrong about everything, or that anti-psychiatry critiques are all baseless. None of those things are true.

The reliability of psychiatric diagnosis (how well two clinicians agree on a diagnosis given the same patient) is a legitimately mixed picture. Modern studies measuring inter-rater reliability of DSM-5 categories, including the field trials that preceded its publication in 2013, found kappa coefficients (a standard measure of agreement beyond chance) that varied widely. Major depressive disorder, for example, came in at a kappa of around 0.28, well below what most psychometricians would consider adequate agreement. Generalized anxiety disorder came in similarly. Some categories, like autism spectrum disorder, did better. Some, like mixed anxiety-depressive disorder, did worse. These are real reliability problems with real consequences for patients.
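For readers who have not met the statistic, here is a minimal sketch of the textbook two-rater Cohen’s kappa, which compares observed agreement with the agreement two raters would reach by chance given how often each uses each label: kappa = (p_observed - p_expected) / (1 - p_expected). The DSM-5 field trials used their own test-retest design and estimator, so this is not their procedure, and the ten “patients” below are invented purely to make the arithmetic visible.

```python
import math
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same cases.

    Returns (p_observed - p_expected) / (1 - p_expected), where
    p_expected is the chance agreement implied by each rater's
    label frequencies.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of cases")
    n = len(rater_a)
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = counts_a.keys() | counts_b.keys()
    p_expected = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)
    if p_expected == 1.0:
        # Both raters used one identical label throughout; kappa is undefined.
        return math.nan
    return (p_observed - p_expected) / (1.0 - p_expected)


# Invented example: two clinicians assess the same ten patients.
rater_a = ["MDD"] * 5 + ["none"] * 5
rater_b = ["MDD", "MDD", "MDD", "none", "none",
           "MDD", "none", "none", "none", "none"]

# Raw agreement is 7 of 10, but chance alone would give 5 of 10.
print(round(cohens_kappa(rater_a, rater_b), 2))  # prints 0.4
```

In this made-up case the clinicians agree on 70% of patients, yet kappa is only 0.4, because half of that agreement would be expected by chance. That is why a kappa near 0.28 is a weaker result than the raw agreement rate behind it might suggest.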

But none of these results require Rosenhan’s study. The empirical case that diagnostic reliability is uneven across DSM categories rests on modern, published, replicated field trials --- not on a possibly-fabricated 1973 demonstration. The honest path forward is to cite the modern reliability literature and to retire Rosenhan as evidence.

Similarly, the institutional critique of mid-century mental hospitals --- the Goffman / Asylums tradition --- contains genuinely valuable observations that do not depend on Rosenhan’s paper being true. Hospitals did dehumanize patients. Diagnostic labels did distort how patients were perceived. Power asymmetries inside institutions were and are real. These are well-documented in the qualitative ethnographic literature and do not need Rosenhan’s pseudopatients to support them.

The position now emerging in the wake of Cahalan’s investigation is something like: Rosenhan was directionally onto a real set of concerns about psychiatric institutions and diagnostic categories, but the specific empirical claims in his paper appear to be substantially invented, and the modern evidence base for those concerns is independent of his work. That is a clean and honest position.

What This Means for Strategists Evaluating Behavioral Science Underlying Major Policy

For anyone whose decisions depend on behavioral-science findings, whether you’re a corporate strategist, a consultant, a founder, a policymaker, or just someone watching a TED talk, the Rosenhan story sharpens three concrete calibration questions.

Has the documentary basis of the study been independently verified? Most famous behavioral science findings rest entirely on the author’s representation that the data was collected the way the paper describes. Independent verification --- in the sense of someone outside the original lab going through the actual records --- is unusual. When you cite a famous study to support a strategy decision, ask whether anyone has audited its underlying records. For most studies, the answer is no. That does not mean they are wrong. It does mean your confidence interval on them should be wider than the typical citation implies.

Have the participants been independently traced? If a study was a famous demonstration with a small number of participants, has anyone followed up with the participants to confirm their accounts? Doug Korpi from the Stanford Prison Experiment told reporters in 2018, nearly half a century later, that he had faked his “psychotic breakdown” because he wanted to leave the study. Harry Lando, alive at the time of Cahalan’s investigation, told her his hospitalization had been positive. The participants, when interviewed, often complicate the story in important ways. For most historic studies, this kind of participant follow-up was never done.

Does the cultural momentum match the evidential strength? When a study is famous, the cultural momentum behind it produces a strong illusion of evidence. Everyone cites it; therefore it must be true. The actual evidence is usually a single paper, sometimes two, often with small samples and rarely with independent audit. The gap between how widely cited a finding is and how well-evidenced it is can be large. The Stanford Prison Experiment was n=24 and unreplicated in its strong form. Rosenhan was n=8, almost none verifiable. Power posing’s hormonal and behavioral effects failed to replicate. The marshmallow test loses most of its predictive power once you control for socioeconomic background. The pattern is consistent enough that an explicit rule is justified: when a behavioral finding has become culturally famous, the prior on its strict factual claims should go down, not up.

The strategist’s job is not to know the latest behavioral science. It is to know which evidence claims have earned the right to influence a decision and which haven’t. The Rosenhan study --- whatever its place in the history of ideas --- has not earned that right, and the version of it that lives in textbooks and policy memos has not earned it either.

Sources

[Rosenhan, D. L. (1973). On being sane in insane places. Science, 179(4070), 250-258. DOI: 10.1126/science.179.4070.250](https://www.science.org/doi/10.1126/science.179.4070.250) --- the original paper.
Cahalan, S. (2019). The Great Pretender: The Undercover Mission That Changed Our Understanding of Madness. Grand Central Publishing. ISBN 978-1-5387-1528-4 --- the multi-year investigative book.
[Spitzer, R. L. (1975). On pseudoscience in science, logic in remission, and psychiatric diagnosis: A critique of Rosenhan’s “On being sane in insane places.” Journal of Abnormal Psychology, 84(5), 442-452. DOI: 10.1037/h0077124](https://pubmed.ncbi.nlm.nih.gov/1194504/) --- the contemporary methodological critique.
Spitzer, R. L. (1976). More on pseudoscience in science and the case for psychiatric diagnosis. Archives of General Psychiatry, 33(4), 459-470. --- Spitzer’s follow-up critique.
Carey, B. (2019, November 4). New Book Casts Doubt on Famous Study That Shifted Care for the Mentally Ill. The New York Times. --- the contemporary mainstream coverage of Cahalan’s findings.
Scull, A. (2022). Desperate Remedies: Psychiatry’s Turbulent Quest to Cure Mental Illness. Belknap Press / Harvard University Press --- contextualizes Rosenhan’s policy impact.
Science History Institute, Distillations podcast: “The Fraud That Transformed Psychiatry” --- documents DSM-III’s “Rosenhan test” framing.
WHYY, “Susannah Cahalan’s search for pseudo patients from the Rosenhan Experiment” --- interview-based summary of Cahalan’s findings.

This article is part of an ongoing series on famous behavioral science studies that did not survive scrutiny. Other entries cover the Stanford Prison Experiment, Diederik Stapel’s fraud, Milgram’s obedience experiments, and Daryl Bem’s precognition paper. The full hub lives at /replication-crisis/.

FAQ

Did Rosenhan commit fraud? The Great Pretender documents a substantial gap between the published paper and Rosenhan’s own surviving records, including symptoms reported at intake, a quoted passage that does not appear in his chart, the exclusion of a pseudopatient whose positive hospital experience contradicted the narrative, and six pseudopatients who cannot be independently identified. Susannah Cahalan uses careful language (“likely fabricated,” “substantially invented”) rather than fraud in the criminal sense. The historian Andrew Scull, author of Desperate Remedies (2022), uses stronger language, calling it a “successful scientific fraud.” Rosenhan died in 2012 and cannot defend himself. The defensible factual claim is that the published paper does not accurately describe what the surviving records show.

Was psychiatric diagnosis in 1973 actually reliable? Probably not, by modern standards. Pre-DSM-III psychiatric diagnosis depended heavily on Freudian and psychoanalytic interpretation, and inter-rater reliability studies from the 1960s and early 1970s found genuinely poor agreement between clinicians. The legitimate concerns about diagnostic reliability that Rosenhan’s paper appealed to were real. They simply did not need his (apparently fabricated) demonstration to establish them. The reform of psychiatric diagnosis via DSM-III addressed these concerns through explicit symptom checklists --- which substantially improved reliability in some categories and less so in others.

What about modern diagnostic reliability? Mixed. The DSM-5 field trials, published in 2013, found Kappa coefficients ranging from very poor (major depressive disorder around 0.28; generalized anxiety disorder similar) to acceptable (autism spectrum disorder around 0.69). Reliability of major mental-illness diagnosis remains a real, well-documented concern in the psychiatric literature. It is now studied with modern methodology rather than via pseudopatient demonstrations.

Does Cahalan have an agenda? Cahalan is a journalist and a former psychiatric patient. She has been transparent about her personal stake: her own 2009 case of anti-NMDA-receptor encephalitis was almost misdiagnosed as primary psychiatric illness. Her interest in the Rosenhan paper is partly diagnostic and partly journalistic. Her book has been broadly well-received in both the popular and academic press. The Psychiatric Times review was positive. Reviewers have noted that her evidentiary case is strongest on the documented inconsistencies and weakest on her inferences about the six unidentified pseudopatients, a distinction Cahalan herself maintains. Her findings have not been seriously rebutted by anyone with access to the same records.

How did the study survive scrutiny for 46 years? The structural answer is that academic incentives do not reward going back to verify the documentary basis of old famous studies. Spitzer’s 1975 critique attacked the logic of the methodology, not the underlying records, because Rosenhan did not release them. Nobody, until Cahalan, was sufficiently motivated to spend years trying to find the pseudopatients, obtain the hospital records, and audit the paper’s specific claims. This is a general feature of social science: published methodology is usually taken on faith, and the post-publication audit ecosystem barely exists.

Did the deinstitutionalization movement actually depend on Rosenhan? Not exclusively. Deinstitutionalization was underway before Rosenhan and had many causes, including antipsychotic medications, civil libertarian critiques of involuntary commitment, federal community mental health policy, and state fiscal pressures. Rosenhan provided the most-cited academic warrant for the most aggressive version of the policy. It is not possible to estimate cleanly what fraction of the eventual policy outcome traces to his paper. It is fair to say his paper was the empirical foundation many advocates and policymakers cited.

Does this mean psychiatric medication and treatment don’t work? No. The efficacy of psychiatric treatment (medications, psychotherapy, and combinations of the two) is an empirical question studied in modern randomized controlled trials. The honest summary of that literature is that mental-health treatment, like most medicine, works well for some people, partially for others, and not at all for the rest. The Rosenhan study has nothing to say about treatment efficacy, only about diagnostic reliability. Conflating the two, as the anti-psychiatry movement often did, is a category error.

What should we teach about Rosenhan now? A reasonable position is: teach the paper, teach its policy impact, teach Spitzer’s 1975 critique, teach Cahalan’s 2019 investigation, and teach the gap between published claims and documentary evidence as a case study in why post-publication audit matters. This is more useful than either continuing to teach the original paper as fact or simply removing it from the curriculum. The story of how a possibly fabricated paper drove half a century of policy is more pedagogically important than the paper itself.


Atticus Li

Experimentation and growth leader. CXL-certified CRO practitioner, Mindworx-certified behavioral economist (1 of ~1,000 worldwide). 200+ A/B tests across energy, SaaS, fintech, e-commerce, and marketplace verticals.
