Acupuncture Sham-Trial Evidence: Real Effect, But Not From Where You'd Think

Atticus Li

Decades of acupuncture trials with sham-needle controls show a consistent pattern: real acupuncture beats no-treatment, but barely beats sham. Most of the apparent benefit is the therapeutic ritual, not the meridian-point mechanism. The lesson generalizes to almost any alternative-medicine evaluation.

By Atticus Li, May 26, 2026

Acupuncture is the rare alternative-medicine modality that has been subjected to large-scale, well-funded, rigorous randomized controlled trials, including individual-patient-data meta-analyses on tens of thousands of subjects. The result of this thirty-year evidence accumulation is something more interesting and more uncomfortable than either the credulous “acupuncture works” or the dismissive “acupuncture is fake” position. The evidence pattern, repeated across condition after condition and meta-analysis after meta-analysis, is this. Real acupuncture, with traditional needles inserted into traditional meridian points, performs better than no treatment at all. Real acupuncture also performs slightly better than sham acupuncture, with statistically significant but typically small effects, primarily for subjective chronic-pain conditions like low back pain, neck pain, knee osteoarthritis, and migraine prevention. But the difference between real acupuncture and sham acupuncture --- needles in the “wrong” locations, retractable theatrical needles that never penetrate skin, or simulated needling with toothpicks --- is small and frequently within the range that can be explained by patient blinding failures or practitioner enthusiasm rather than by anything specific to the meridian theory that supposedly justifies the intervention.

The conclusion that the most rigorous evidence supports is therefore not what either the proponents or the strongest critics say. The proponents say acupuncture works, by which they mean the specific traditional mechanism --- the redirection of Qi through meridian channels --- is therapeutically active. The evidence does not support this. The critics say acupuncture is a complete placebo, by which they mean it has no effect beyond expectation. The evidence does not quite support this either; the effect, especially for chronic pain, is small but reproducibly larger than no treatment. What the evidence supports is the position that David Colquhoun and Steven Novella articulated in a 2013 Anesthesia & Analgesia editorial: acupuncture is theatrical placebo. The therapeutic ritual --- the consultation, the attention, the touch, the calm room, the patient expectation, the regression-to-the-mean inherent in seeking treatment when pain is at its worst --- produces a small but real effect on subjective chronic-pain reports. The specific needle locations and the specific mechanism do not appear to matter. The evidence sustains the ritual; it does not sustain the meridian theory.

This pattern --- real effect, wrong mechanism --- is more strategically interesting than either of the simpler conclusions, and it generalizes well beyond acupuncture itself. For anyone evaluating an alternative-medicine claim, a wellness intervention, a brand-name therapy, or any procedure whose theoretical justification is weak but whose patient testimonials are strong, the acupuncture evidence base is the clearest demonstration available of why the right comparator is not “no treatment” but “the most plausible sham of the treatment.” This article walks through the major sham-controlled acupuncture trials and meta-analyses, the Madsen 2009 BMJ paper that crystallized the pattern, the Vickers 2012 and Vickers 2018 individual-patient-data meta-analyses that refined the pain-specific findings, the Colquhoun-Novella theatrical-placebo framework, and what a strategist evaluating a wellness or healthcare product should take from all of it.

The Sham Acupuncture Methodology

A randomized controlled trial of any drug typically has two conditions: the active drug and an inert placebo pill that looks and tastes identical. Both patient and investigator are blinded to which is which. This blinding is what allows the trial to distinguish the pharmacological effect of the drug from the expectation effect of swallowing something one believes might help. Acupuncture trials face a harder version of this design problem. A needle is harder to fake than a pill. The patient typically feels the needle. The practitioner cannot be blinded to the needle location or the depth of insertion. The setting is more theatrical than a clinic visit. Decades of methodological work have nonetheless produced a family of sham techniques that allow at least patient blinding and, in some designs, partial practitioner blinding.

The first family of sham techniques is wrong-location acupuncture, sometimes called sham-point or non-meridian acupuncture. Needles are inserted at the correct depth and with the correct technique, but at locations that are not traditional acupuncture points. From the patient’s perspective the experience is similar to real acupuncture. From the perspective of the traditional theory --- which claims that the specific point matters --- this should not work. If the meridian theory is correct, wrong-location needling should perform substantially worse than correct-point needling. Across the major trials, this prediction has not been borne out. Wrong-location needling typically performs almost identically to correct-point needling, with small and frequently non-significant differences. This is the first major piece of evidence against the meridian-specificity claim.

The second family is superficial or minimal acupuncture. Needles are inserted but only superficially, well above the depth that traditional technique specifies. Some practitioners do not consider this a true sham because it still involves needle insertion. Trials using this design typically find that minimal acupuncture performs similarly to full acupuncture, again with small differences.

The third family is non-penetrating sham acupuncture, of which the Streitberger needle is the canonical instrument. The Streitberger needle, developed by Konrad Streitberger in the late 1990s, looks identical to a regular acupuncture needle but has a blunt tip that retracts into the handle when pressed against the skin, producing the sensation of insertion without actually penetrating. With careful technique, patients cannot reliably distinguish a Streitberger needle from a real one. This was a methodological breakthrough that allowed proper patient blinding for the first time. In trials using Streitberger or similar retractable sham needles, the difference between real and sham acupuncture has typically been even smaller than in wrong-location designs, suggesting that even the act of skin penetration may not be doing much of the work.

The fourth family is the toothpick or simulated needling sham. The practitioner taps the skin with a toothpick housed in a guide tube, producing a sensation similar to needle insertion. In Daniel Cherkin’s large 2009 Archives of Internal Medicine trial of acupuncture for chronic low back pain --- which used simulated needling as one of the comparator arms --- there was no statistically significant difference among the two real-acupuncture arms (individualized acupuncture and standardized acupuncture) and simulated needling on the primary outcome of back pain at 8 weeks. All three of these arms outperformed usual care. This is a particularly clean example of the pattern that the more recent and methodologically rigorous trials have repeatedly produced: the acupuncture ritual works better than nothing; the specific needling mechanism does not appear to be what is doing the work.

The methodological history of the sham acupuncture trial is important for one reason. The proponents of acupuncture have, at various points, argued that the absence of a real-versus-sham difference is evidence that sham acupuncture itself is therapeutically active --- that all needling, even superficial or non-penetrating needling, produces real physiological effects through some non-meridian mechanism. This is a logically possible position but it requires its own evidence base, which has not been developed to anything like the standard to which the original meridian theory has been tested. It also requires the proponents to abandon the meridian theory in everything but name. If toothpick taps work as well as needling, the meridian theory has failed; whatever is producing the effect is something other than the redirection of Qi through specific points.

Madsen 2009: The BMJ Systematic Review That Crystallized The Pattern

In January 2009, Matias Vested Madsen, Peter Gøtzsche, and Asbjørn Hróbjartsson published a paper in the BMJ titled “Acupuncture treatment for pain: systematic review of randomised clinical trials with acupuncture, placebo acupuncture, and no acupuncture groups.” Gøtzsche and Hróbjartsson were both at the Nordic Cochrane Centre in Copenhagen, the institution responsible for some of the most rigorous and skeptical placebo and meta-analytic work in modern medicine. The Madsen paper was a deliberate attempt to formalize the three-arm comparison that the patchwork of prior acupuncture trials had implicitly provided.

The methodological insight of the Madsen paper was structural. To answer the question “what is acupuncture doing?” you need to be able to separate two effects: the effect of the specific intervention over and above the placebo procedure, and the effect of the placebo procedure over and above receiving no treatment at all. That takes three comparisons. The first comparison --- real acupuncture versus sham acupuncture --- isolates the specific contribution of the meridian-and-needle mechanism, on the assumption that the sham reproduces the placebo experience as faithfully as possible. The second comparison --- sham acupuncture versus no acupuncture --- isolates the contribution of the ritual itself, the consultation and attention and expectation that any therapeutic encounter provides. The third comparison --- real versus no treatment --- combines both effects and is the comparison that most clinical trials default to, but it tells you nothing about which of the two components is responsible for the result.

Madsen and colleagues identified 13 randomized trials that included all three arms --- real acupuncture, placebo acupuncture, and no acupuncture --- for pain conditions. They pooled the results across trials, separately computing the standardized mean difference for the real-versus-placebo comparison and for the placebo-versus-no-treatment comparison. The result was the cleanest available decomposition of where acupuncture’s effect comes from. The standardized mean difference for placebo acupuncture versus no acupuncture was approximately 0.42, a moderate effect, meaning that going through the ritual of receiving fake acupuncture produced a clinically meaningful improvement in pain compared with receiving no treatment at all. The standardized mean difference for real acupuncture versus placebo acupuncture was approximately 0.17, a small effect, meaning that the specific contribution of real needles in traditional locations was about 40 percent the size of the ritual effect, or under a third of the combined effect if the two are treated as additive. The combined real-versus-no-treatment effect was therefore largely accounted for by the placebo-ritual component, with the meridian-specific component making only a small additional contribution.
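The decomposition can be checked with a few lines of arithmetic. This is a minimal sketch that assumes the two standardized mean differences simply add up to the real-versus-no-treatment effect; that additivity is an illustrative simplification, and the function name `decompose_smd` is hypothetical rather than anything from the review.

```python
def decompose_smd(real_vs_sham: float, sham_vs_none: float) -> dict:
    """Split a real-vs-no-treatment effect into specific and ritual parts.

    Assumes the two standardized mean differences are additive, which is
    an illustrative simplification and not something the review reports.
    """
    total = real_vs_sham + sham_vs_none
    return {
        "total": total,
        "specific_share": real_vs_sham / total,
        "ritual_share": sham_vs_none / total,
        "specific_to_ritual_ratio": real_vs_sham / sham_vs_none,
    }


# Approximate pooled values quoted in the text for Madsen 2009.
result = decompose_smd(real_vs_sham=0.17, sham_vs_none=0.42)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

Under that assumption the ritual component accounts for roughly 70 percent of the combined effect and the specific component for the rest.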

The Madsen team’s interpretation, expressed in the paper’s discussion, was deliberately understated and methodologically careful. They did not claim that acupuncture does not work. They claimed that the available evidence supported a small specific effect of acupuncture on pain that was so small as to be of questionable clinical importance, and that the larger portion of acupuncture’s apparent benefit was attributable to the placebo response generated by the ritual. The clinical implication was not that practitioners should stop offering acupuncture --- a small effect added to a large placebo effect is still a clinical benefit that patients experience --- but that the theoretical claims of traditional acupuncture, about specific meridian points and specific Qi mechanisms, were not supported by the pattern of results.

The reception of the paper was, predictably, polarized. Skeptics treated it as decisive evidence that acupuncture’s mechanism was a placebo. Proponents argued that the included trials had methodological problems, that the sham comparator was itself a partial intervention, and that pooled analyses across heterogeneous pain conditions obscured real effects in specific conditions. Both sides had partly valid points. The Madsen paper became, over the following decade, the canonical citation for the proposition that the bulk of acupuncture’s effect is mediated by the therapeutic ritual rather than the specific intervention.

Vickers 2012 And Vickers 2018: The Individual Patient Data Meta-Analyses

The Madsen approach to meta-analysis is the standard approach: pool published summary statistics across trials. A more powerful approach is to pool the underlying patient-level data, when it can be obtained from the original trialists. This is called individual patient data (IPD) meta-analysis, and it allows much finer analysis than summary-level pooling: you can adjust for baseline characteristics, examine subgroup effects with proper statistical power, and harmonize outcome measures across trials. The downside is that IPD meta-analysis is enormously labor-intensive and requires the cooperation of every original trialist whose data you want.
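For readers who have not seen summary-level pooling, here is a minimal sketch of the simplest version, an inverse-variance fixed-effect pool. It is not necessarily the model any of the papers discussed here used (published reviews often use random-effects models), and the three effect sizes and standard errors are hypothetical numbers chosen only to show the mechanics.

```python
import math


def pool_fixed_effect(effects, std_errors):
    """Inverse-variance (fixed-effect) pooled estimate and its standard error."""
    # Each trial is weighted by the inverse of its sampling variance,
    # so more precise trials count for more.
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se


# Hypothetical trial-level SMDs and standard errors (not real trial data).
effects = [0.10, 0.20, 0.30]
std_errors = [0.10, 0.10, 0.20]

pooled, pooled_se = pool_fixed_effect(effects, std_errors)
low, high = pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se
print(f"pooled SMD = {pooled:.3f}, 95% CI [{low:.3f}, {high:.3f}]")
```

Summary-level pooling only ever sees one number per trial, which is why it cannot adjust for patient-level baseline characteristics the way an IPD analysis can.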

In October 2012, Andrew Vickers and colleagues at Memorial Sloan-Kettering Cancer Center published a paper in Archives of Internal Medicine titled “Acupuncture for chronic pain: individual patient data meta-analysis.” The Vickers team had spent years assembling the patient-level data from 29 high-quality randomized trials of acupuncture for chronic pain --- 17,922 patients in total --- across four pain conditions: chronic back and neck pain, osteoarthritis, chronic headache, and shoulder pain. The trial set was restricted to studies meeting strict methodological standards, particularly studies that had compared acupuncture both to sham acupuncture and to a no-acupuncture control. This was, at the time of publication, the largest and most rigorous meta-analysis of acupuncture for pain ever conducted.

The Vickers 2012 findings were more favorable to acupuncture than the Madsen 2009 findings, but the underlying pattern was the same. Acupuncture was statistically significantly more effective than sham acupuncture across all four pain conditions, with standardized mean differences ranging from approximately 0.15 to 0.23 --- small effects, but reliably detectable in a dataset of nearly 18,000 patients. Acupuncture was substantially more effective than the no-acupuncture control, with standardized mean differences in the range of 0.43 to 0.55 --- moderate effects. The arithmetic of the comparison broadly reproduces the Madsen pattern: taking these ranges at face value, roughly three-fifths of the effect of real acupuncture versus no treatment is accounted for by the sham-versus-no-treatment ritual component, with roughly two-fifths attributable to whatever specific mechanism real acupuncture adds.
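The share attributable to the specific component can be bounded directly from the two quoted ranges. This is a rough sketch: it assumes the real-versus-sham effect can be divided by the real-versus-no-acupuncture effect even though the two estimates come from different subsets of trials, and it treats the rest as the ritual component.

```python
from itertools import product

real_vs_sham = (0.15, 0.23)   # approximate SMD range quoted for real vs sham
real_vs_none = (0.43, 0.55)   # approximate SMD range quoted for real vs no acupuncture

# Specific share = (real vs sham) / (real vs no acupuncture).
# Widest bounds: try every pairing of the range endpoints.
all_shares = [s / n for s, n in product(real_vs_sham, real_vs_none)]

# Narrower reading: pair low with low and high with high.
matched_shares = [s / n for s, n in zip(real_vs_sham, real_vs_none)]

print(f"specific share, any pairing:  {min(all_shares):.0%} to {max(all_shares):.0%}")
print(f"specific share, matched ends: {min(matched_shares):.0%} to {max(matched_shares):.0%}")
```

Whichever pairing is used, the ritual component is the larger part in most cases, and the specific component never gets much beyond half of the total.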

The Vickers team’s interpretation was more positively framed than Madsen’s, in part because Vickers and colleagues emphasized the statistical robustness of the small real-versus-sham effect across conditions. Their reading was that the consistent, statistically significant, small advantage of real over sham was evidence that acupuncture provides a specific therapeutic effect over and above placebo, and that this specific effect is clinically meaningful when added to the placebo effect that the ritual itself produces. This interpretation was disputed in the editorial and letters that followed the paper. The most pointed critique was that the small magnitude of the real-versus-sham difference --- in some conditions in the range of 0.15 standardized mean differences --- was close to the magnitude that could plausibly be produced by imperfect patient blinding alone. If even a small fraction of patients can guess which arm they are in, and the guess correlates with expectation, the resulting bias inflates the real-versus-sham difference in the direction the trial finds.
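The blinding critique can be made concrete with a back-of-the-envelope sketch. It assumes, loudly, that there is no true specific effect, that believing you received real acupuncture shifts reported improvement by a fixed number of standard deviations, and that the outcome SD is roughly 1. The belief rates and the 0.5 SD shift below are hypothetical inputs, not estimates from any trial.

```python
def unblinding_bias(p_believe_real_in_real, p_believe_real_in_sham, expectation_shift_sd):
    """Apparent real-vs-sham SMD produced by belief differences alone.

    Assumes zero true specific effect and an outcome SD of roughly 1, so the
    gap in mean outcome equals the gap in belief rates times the shift.
    """
    belief_gap = p_believe_real_in_real - p_believe_real_in_sham
    return belief_gap * expectation_shift_sd


def belief_gap_needed(target_smd, expectation_shift_sd):
    """Belief-rate gap that would manufacture target_smd with no true effect."""
    return target_smd / expectation_shift_sd


# Hypothetical inputs: 65% of the real arm and 50% of the sham arm believe
# they got real needles, and that belief is worth 0.5 SD of reported improvement.
bias = unblinding_bias(0.65, 0.50, 0.5)
gap = belief_gap_needed(0.15, 0.5)

print(f"spurious SMD from unblinding: {bias:.3f}")
print(f"belief gap needed to fake an SMD of 0.15: {gap:.0%}")
```

Whether gaps of that size actually occur is an empirical question about each trial's blinding checks; the sketch only shows why the objection cannot be dismissed on arithmetic grounds.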

The Vickers team revisited the analysis in 2018 in a paper published in the Journal of Pain titled “Acupuncture for chronic pain: update of an individual patient data meta-analysis.” The update added 10 more trials and brought the patient total to 20,827 across the same four pain conditions. The headline finding was reaffirmed: acupuncture remained statistically significantly more effective than sham control, with standardized mean differences in approximately the same range as the 2012 analysis. The team also documented that the treatment effect persisted at 12 months in trials that had measured long-term follow-up --- the effect was not just an acute placebo response that dissipated rapidly. This long-term-persistence finding was interpreted by the proponents as evidence that the effect must be more than a placebo response; the critical response was that long-term persistence is also predicted by the regression-to-the-mean pattern characteristic of chronic-pain conditions, which fluctuate substantially and whose sufferers are most likely to seek treatment when their symptoms are at their worst.
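Regression to the mean is easy to demonstrate with a small simulation. The sketch below is a toy model with assumed parameters: each simulated person has a stable long-run pain level plus day-to-day noise, people enroll only on days when pain exceeds a threshold, and nobody receives any treatment. It illustrates improvement relative to baseline in an untreated group; it says nothing about differences between randomized arms.

```python
import random
import statistics


def simulate_regression_to_mean(n=50_000, threshold=1.0, seed=0):
    """Mean pain at enrollment and at follow-up for untreated enrollees."""
    rng = random.Random(seed)
    enrolled_baseline, enrolled_followup = [], []
    for _ in range(n):
        usual = rng.gauss(0.0, 1.0)              # long-run pain level
        baseline = usual + rng.gauss(0.0, 1.0)   # pain on the day they might enroll
        if baseline > threshold:                 # only bad days lead to enrollment
            followup = usual + rng.gauss(0.0, 1.0)  # a later day, no treatment given
            enrolled_baseline.append(baseline)
            enrolled_followup.append(followup)
    return statistics.mean(enrolled_baseline), statistics.mean(enrolled_followup)


baseline_mean, followup_mean = simulate_regression_to_mean()
print(f"mean pain at enrollment: {baseline_mean:.2f}")
print(f"mean pain at follow-up:  {followup_mean:.2f}")
```

The enrolled group looks substantially better at follow-up purely because of how it was selected, which is why improvement from baseline, on its own, is weak evidence for any intervention.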

The Vickers IPD analyses are the strongest evidence available for the proposition that acupuncture has a small but real specific effect on chronic pain beyond placebo. They are not without methodological controversy --- the patient-blinding question, the magnitude question, the placebo-comparator-validity question --- but they are the cleanest evidence base on the proponent side. What they do not support, in any of the analyses, is the proposition that acupuncture’s effect is large. They do not support the proposition that the meridian-and-point theory of traditional acupuncture is mechanistically correct. And they do not support the proposition that acupuncture has effects on non-pain conditions where the trial evidence is much thinner.

Colquhoun And Novella: The “Theatrical Placebo” Framework

In June 2013, the journal Anesthesia & Analgesia published an editorial titled “Acupuncture is theatrical placebo” by David Colquhoun, a professor of pharmacology at University College London, and Steven Novella, a clinical neurologist at Yale and a prominent science communicator. The editorial was a response to a special issue of the journal on acupuncture and was an unusually pointed intervention in what had until then been a more polite scientific debate. The Colquhoun-Novella editorial is the clearest statement available of the case that the cumulative sham-controlled evidence is best interpreted as theatrical placebo rather than as evidence of a small specific effect.

The framework Colquhoun and Novella developed has three components. First, the cumulative real-versus-sham evidence base, including the Vickers analyses, shows differences that are small enough to be plausibly explained by patient unblinding bias and practitioner enthusiasm rather than by a specific mechanism. The 0.15-to-0.20 standardized mean difference range is within the magnitude that can be produced by bias in trials where patient blinding cannot be perfect. Second, the sham-versus-no-treatment difference is moderate and reproducible, indicating that the therapeutic ritual is producing a real psychological effect on subjective pain. This is not surprising and is consistent with decades of placebo research on expectation, attention, and patient-practitioner interaction. Third, the specific mechanisms claimed by traditional acupuncture --- meridians, points, Qi --- are not necessary to explain any of the observed effects. A simpler model in which the entire effect comes from the placebo response generated by an elaborate therapeutic ritual is sufficient.

The “theatrical placebo” label is not dismissive of placebo effects. Placebo responses for chronic-pain conditions are well-documented in the broader literature and can be substantial. The framework is dismissive specifically of the mechanism. The acupuncture ritual is therapeutic in roughly the way that any elaborate, attentive, ritualized therapeutic encounter is therapeutic. The needles are stage props. The meridian map is a story that the patient and practitioner share that helps the placebo response take effect. This framework has the explanatory virtue of accounting for every piece of the evidence base --- the moderate sham-versus-no-treatment effect, the small real-versus-sham effect, the failure of wrong-location and minimal-needling controls to produce reduced effects --- without requiring the existence of biological mechanisms that do not appear in any other domain of biology.

The Colquhoun-Novella framework has its own critics, particularly among acupuncture researchers who argue that the framework underweights the Vickers IPD evidence and overweights the patient-blinding objection. The debate has not been resolved at the level of the high-quality trials. What the framework has done is provide a coherent interpretive frame for the otherwise puzzling pattern of the evidence. Without that frame, the pattern --- works better than nothing, barely better than sham, mechanism does not seem to matter --- is difficult to make sense of in the conventional terms of acupuncture proponents or strict skeptics. With the frame, the pattern is exactly what theatrical placebo predicts.

What Survives And What Does Not: The Cochrane Picture

The Cochrane Collaboration produces what is generally regarded as the most rigorous systematic-review evidence in medicine. Cochrane reviews of acupuncture for various conditions provide the most comprehensive picture of which acupuncture applications have evidentiary support and which do not. The pattern across the Cochrane reviews is consistent with the Madsen and Vickers findings: small benefits for some chronic-pain conditions, no benefits for many other conditions, and never the dramatic effects that traditional acupuncture proponents have claimed.

The 2016 Cochrane review by Klaus Linde and colleagues, “Acupuncture for the prevention of episodic migraine,” is one of the more positive Cochrane assessments. It concluded that acupuncture reduced the frequency of episodic migraines compared with no acupuncture and compared with prophylactic drug treatment, with a small advantage of real over sham acupuncture. This is broadly the picture that the Vickers IPD analyses produce for chronic headache. Migraine prevention is therefore one of the better-supported acupuncture indications.

Other Cochrane reviews have been less positive. The 2020 Cochrane review on acupuncture for chronic non-specific low back pain found that real acupuncture probably reduces pain in the short term compared with sham acupuncture, but the effect was small and the certainty of evidence was low. The 2018 Cochrane review on acupuncture for primary dysmenorrhea found uncertain evidence. The Cochrane review on acupuncture for smoking cessation found no consistent evidence of effectiveness. Cochrane reviews on acupuncture for stroke rehabilitation, IVF outcomes, depression in pregnancy, and several other non-pain conditions have generally found insufficient or low-quality evidence to support effectiveness.

The Cochrane pattern, taken in aggregate, supports a sharply restricted view of where acupuncture has evidentiary support. The cases that survive the highest evidentiary scrutiny are subjective chronic-pain conditions, where the effect is small but reproducible. Migraine prevention is the strongest case. Chronic back pain, neck pain, and knee osteoarthritis are similar but with smaller effects. The cases that do not survive are essentially everything else --- the broader range of conditions for which traditional acupuncture is claimed to be effective and for which patients are often referred. The contrast between what the strongest evidence supports and what the broader practice of acupuncture claims is large, and the gap is roughly the gap between “small specific effect on subjective chronic pain plus a substantial placebo-ritual effect” and “Qi-mediated systemic healing across the full range of bodily complaints.”

The Implication For Traditional Meridian-Point Claims

The traditional theory of acupuncture, as it has been transmitted from classical Chinese medical texts and reinterpreted in modern practice, makes specific claims that the contemporary sham-trial evidence base does not support. The traditional theory claims that there are specific energetic channels --- meridians --- through which Qi flows, that there are specific points along those meridians where needling can redirect or rebalance the Qi flow, and that the choice of point matters for the therapeutic effect. If this theory is correct, then needling at non-meridian locations should be substantially less effective than needling at the correct meridian points. Across the trial literature, this prediction has consistently failed. Wrong-location sham acupuncture performs nearly identically to correct-point acupuncture. The differences that are detected are too small to be consistent with a mechanism that depends on point specificity.

The Streitberger and toothpick sham trials make this even sharper. If the mechanism depends on needle insertion --- if the needle needs to penetrate the skin and reach a specific depth in a specific point --- then non-penetrating sham should fail entirely. It does not; it performs nearly as well as real needling. If the mechanism depends on the elaborate procedure but not the specific physical act of needling, then the most parsimonious explanation is that the procedure is the ritual itself, with the specific physical manipulation contributing little or nothing.

This does not mean that the traditional acupuncture practitioner is providing nothing of value. The practitioner is providing the ritual, the attention, the consultation, the framework within which the patient interprets their pain experience, and the regression-to-the-mean opportunity that any therapeutic encounter at the worst moment of a chronic condition provides. These are real psychological inputs that produce real subjective improvements. The practitioner is also, in many cases, providing the listening and attention that patients are not getting elsewhere in their medical care. None of this requires the meridian theory to be correct. It requires the ritual to be performed convincingly enough that the patient can engage with it. The theory functions as the narrative scaffold that holds the ritual together. The scaffold does not need to be literally true to do its work.

The honest conclusion, which neither the proponents nor the skeptics have been fully comfortable with, is that traditional acupuncture is a moderately effective placebo-delivery system for chronic subjective pain. Its effectiveness is real, in the sense that patients reliably report less pain on real outcome measures. Its mechanism is not what the theory claims. The point-specificity claim has been tested and failed. The Qi claim is unfalsifiable in the way that most metaphysical mechanisms are. The clinical recommendation that follows from the evidence is that acupuncture is reasonable for a patient with chronic pain who has not responded to other interventions, that patients should be told the realistic magnitude of the expected effect, and that the most important thing the practitioner is providing is the ritualized therapeutic encounter rather than the needling technique.

The General Pattern: Compare Against Sham, Not No-Treatment

The strategic lesson of the acupuncture evidence base generalizes well beyond acupuncture itself. The lesson is structural. For any intervention --- alternative medicine, wellness product, mental-health app, dietary supplement, productivity intervention, behavioral nudge --- the headline question of “does it work?” decomposes into two separate questions whose answers are often very different. Does it work better than nothing? And does it work better than the best plausible sham of itself? The first question is almost always easier to answer in the affirmative than the second. The first question is what most product comparison studies and most popular-press summaries actually measure. The second question is what would tell you whether the specific mechanism the intervention claims is doing the work.

This applies particularly to interventions where the user’s expectation and the ritual of the intervention are themselves substantial inputs into the outcome. Almost any meditation app outperforms doing nothing for 10 minutes a day. The right comparison is the meditation app versus a sham app that does something similar but without the specific technique claim. Almost any productivity intervention --- a new planner, a habit tracker, a focus app --- outperforms the prior baseline because the user is paying attention to the problem. The right comparison is the new intervention versus an attention-matched control. Many nutritional supplements that are associated with improved health outcomes in observational studies fail to outperform placebo when subjected to a proper randomized controlled trial. The pattern is consistent across health and wellness interventions broadly: effects typically shrink, often dramatically, when the comparator is upgraded from “nothing” to “active sham.”

For a strategist evaluating a wellness or healthcare product, the operational rule that follows is this. When a product cites a study showing that it works, look immediately at what the comparator was. If the comparator was a waitlist, a no-treatment control, or “treatment as usual,” the study is not telling you much about the specific mechanism the product is claiming. If the comparator was an active sham --- an attention-matched alternative, a credible placebo, a similar-looking app with different content --- and the product still outperformed, then the evidence is more informative about the specific mechanism. The acupuncture literature is the cleanest case of a field where the no-treatment comparison shows a moderate effect and the sham comparison shows a much smaller effect. The same decomposition is available, when the trials exist, for almost any wellness intervention. The decomposition is almost never made in the marketing materials.
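The operational rule is simple enough to write down as a lookup. This is a hypothetical sketch, not a validated appraisal tool: the `Comparator` categories and the wording of the verdicts are illustrative labels for the distinctions drawn above.

```python
from enum import Enum


class Comparator(Enum):
    NO_TREATMENT = "no treatment"
    WAITLIST = "waitlist"
    TREATMENT_AS_USUAL = "treatment as usual"
    ACTIVE_SHAM = "active sham"


def mechanism_evidence(comparator: Comparator, product_outperformed: bool) -> str:
    """Rough reading of what a cited study says about the claimed mechanism."""
    if not product_outperformed:
        return "no support for the product"
    if comparator is Comparator.ACTIVE_SHAM:
        return "informative about the specific mechanism"
    # Waitlist, no-treatment, and usual-care controls cannot separate
    # the claimed mechanism from ritual, attention, and expectation.
    return "beats nothing; says little about the specific mechanism"


for comparator in Comparator:
    print(f"{comparator.value}: {mechanism_evidence(comparator, True)}")
```

A real appraisal would also weigh sample size, blinding checks, and outcome choice; the point of the sketch is only that the comparator is the first field to read.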

The strategist’s diagnostic question becomes: what would the strongest possible sham of this product look like, and has it been tested against that? A meditation app that has only been compared with a waitlist control is not yet evidence-based; it is evidence-friendly. A new ergonomic chair that has only been compared with no chair change is not yet evidence-based. A supplement that has only been compared with no supplement is not yet evidence-based. The number of wellness and healthcare products whose evidence base survives an upgrade from no-treatment to sham control is small. Most of the effect, in most cases, comes from the ritual, the attention, the expectation, and the regression-to-the-mean that any deliberate intervention into a fluctuating subjective condition is going to produce.

Strategist Takeaway

For anyone evaluating, building, or selling a wellness or healthcare product, the acupuncture evidence base offers three operationally useful rules. First, distinguish the question of whether your product works from the question of whether the specific mechanism you claim is what makes it work. These can have very different answers. A product that works through placebo, ritual, attention, or expectation is still a product that works; the people who use it are getting real benefit. But the mechanism claim is the part that determines competitive defensibility, scientific credibility, and regulatory exposure. A meditation app that produces real benefits because it gets users to sit quietly for ten minutes is doing something good; a meditation app that claims its proprietary breathing technique is the active ingredient is making a mechanistic claim that should be testable against a sham version that uses a different breathing technique. If the sham version performs equivalently, the mechanism claim is false and the marketing should not rest on it.

Second, when evaluating someone else’s product or evidence base, look at what the comparator in the cited studies was. The number of products whose efficacy disappears when the comparator is upgraded from “waitlist” to “active sham” is large. This is not just an alternative-medicine pattern; it is a wellness, mental-health-app, productivity-tool, dietary-supplement, and behavioral-nudge pattern as well. The default assumption when reading a claim of efficacy that rests on a no-treatment-control study should be that a substantial fraction of the effect is the ritual and that the specific-mechanism contribution may be small or zero. Exceptions exist, but establishing one requires active-comparator evidence.

Third, the framework matters operationally for product design. If the bulk of the effect of an intervention comes from the therapeutic ritual rather than the specific mechanism, then the design priorities are different from what the mechanistic framing would suggest. Effort should go into the elements that strengthen the ritual: the consultation, the attention, the user experience, the narrative scaffold that helps the user interpret and engage with the intervention, the regularity of contact, the perceived expertise of whoever is delivering the intervention. These are exactly the elements that traditional acupuncture happens to be very good at, regardless of the meridian theory. They are also the elements that pharmaceutical and digital-therapeutic interventions, with their drug-delivery or app-based focus, often neglect. A product that takes the ritual seriously, alongside whatever specific mechanism it offers, is likely to outperform a product that focuses only on the specific mechanism. The cumulative acupuncture evidence is not just a debunking; it is a design lesson about what produces benefit in any intervention whose outcome is mediated through subjective patient experience.

The pattern is not unique to acupuncture. It is the pattern of any intervention into a subjective, fluctuating condition where expectation and attention are themselves causal inputs into the outcome. The contribution of the acupuncture trial literature is to have demonstrated this pattern with unusual rigor and unusual completeness, across many trials, with sham comparators that get progressively closer to isolating the specific contribution of the claimed mechanism, and with the answer converging on “small specific effect, much larger ritual effect” across condition after condition. That demonstration is the most useful generalizable evidence the alternative-medicine literature has produced. It is more useful as a methodological template than as a verdict on acupuncture itself.

Sources

Madsen, M. V., Gøtzsche, P. C., & Hróbjartsson, A. (2009). Acupuncture treatment for pain: Systematic review of randomised clinical trials with acupuncture, placebo acupuncture, and no acupuncture groups. BMJ, 338, a3115. https://doi.org/10.1136/bmj.a3115
Vickers, A. J., Cronin, A. M., Maschino, A. C., Lewith, G., MacPherson, H., Foster, N. E., Sherman, K. J., Witt, C. M., & Linde, K. (2012). Acupuncture for chronic pain: Individual patient data meta-analysis. Archives of Internal Medicine, 172(19), 1444-1453. https://doi.org/10.1001/archinternmed.2012.3654
Vickers, A. J., Vertosick, E. A., Lewith, G., MacPherson, H., Foster, N. E., Sherman, K. J., Irnich, D., Witt, C. M., & Linde, K. (2018). Acupuncture for chronic pain: Update of an individual patient data meta-analysis. Journal of Pain, 19(5), 455-474. https://doi.org/10.1016/j.jpain.2017.11.005
Colquhoun, D., & Novella, S. P. (2013). Acupuncture is theatrical placebo. Anesthesia & Analgesia, 116(6), 1360-1363. https://doi.org/10.1213/ANE.0b013e31828f2d5e
Linde, K., Allais, G., Brinkhaus, B., Fei, Y., Mehring, M., Vertosick, E. A., Vickers, A., & White, A. R. (2016). Acupuncture for the prevention of episodic migraine. Cochrane Database of Systematic Reviews, 6, CD001218. https://doi.org/10.1002/14651858.CD001218.pub3

FAQ

Does acupuncture work?

It depends on what you mean by “work.” Real acupuncture reliably produces better outcomes than no treatment at all for subjective chronic-pain conditions like back pain, neck pain, knee osteoarthritis, and migraine prevention. But real acupuncture barely outperforms sham acupuncture in well-controlled trials, with the difference typically small enough to be within the range that imperfect patient blinding could produce. The pattern across the highest-quality evidence base is that most of acupuncture’s effect comes from the therapeutic ritual --- the attention, the consultation, the patient expectation, the regression-to-the-mean inherent in seeking treatment at the worst moment of a chronic condition --- rather than from the specific needling mechanism.

What is sham acupuncture?

Sham acupuncture is a placebo control designed to mimic the experience of real acupuncture without delivering the specific intervention that traditional theory claims is therapeutic. Common forms include needling at non-meridian locations, superficial or minimal needling, retractable Streitberger needles that produce the sensation of insertion without actually penetrating the skin, and simulated needling with toothpicks tapped against the skin through a guide tube. The point of sham acupuncture is to separate the placebo effect of the therapeutic ritual from any specific effect of correct needling.

Does the Vickers IPD meta-analysis prove acupuncture has a specific effect beyond placebo?

The Vickers 2012 and Vickers 2018 individual-patient-data meta-analyses, pooling roughly 18,000 patients in 2012 and over 20,000 in the 2018 update, found a small but statistically significant advantage of real acupuncture over sham acupuncture across four chronic pain conditions. Proponents read this as evidence of a real specific effect. Critics read the small magnitude (standardized mean differences of 0.15 to 0.23) as plausibly explained by imperfect patient blinding rather than by a specific mechanism. The debate is unresolved at the level of the highest-quality trials. What the Vickers analyses do not support is the proposition that acupuncture’s effect is large or that the meridian theory of point specificity is correct.

What does the Madsen 2009 BMJ review show?

Madsen, Gøtzsche, and Hróbjartsson assembled trials that included three arms (real acupuncture, sham acupuncture, and no acupuncture) and decomposed the total effect into two components. The standardized mean difference of sham acupuncture versus no acupuncture was approximately 0.42, a moderate ritual effect. The standardized mean difference of real acupuncture versus sham acupuncture was approximately 0.17, a small specific effect. Treating the two components as additive, roughly 70 percent of the effect of acupuncture versus no treatment (0.42 of a combined 0.59) was attributable to the ritual rather than to specific needling.
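The arithmetic behind that split is short enough to write down. The sketch below takes the two approximate standardized mean differences quoted above and assumes they simply add up to the real-versus-no-treatment effect. That additivity is a simplification, since the two figures come from separately pooled comparisons, and the function name is hypothetical.

```python
def decompose_effect(sham_vs_none, real_vs_sham):
    """Split a total effect into ritual and specific shares, assuming additivity."""
    total = sham_vs_none + real_vs_sham
    return {
        "total": total,
        "ritual_share": sham_vs_none / total,    # attributable to the ritual
        "specific_share": real_vs_sham / total,  # attributable to correct needling
    }

# Approximate standardized mean differences as quoted in the answer above
parts = decompose_effect(sham_vs_none=0.42, real_vs_sham=0.17)
print(f"total = {parts['total']:.2f}, "
      f"ritual share = {parts['ritual_share']:.0%}, "
      f"specific share = {parts['specific_share']:.0%}")
```

Under that assumption the ritual share comes out at about 0.71 and the specific share at about 0.29. The same two-line calculation applies to any intervention for which both a sham-versus-nothing and a real-versus-sham estimate exist.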

What is the “theatrical placebo” framework?

Colquhoun and Novella’s 2013 Anesthesia & Analgesia editorial framed the cumulative evidence as showing acupuncture is a moderately effective placebo delivered through an elaborate therapeutic ritual. The needles, the meridians, and the points are part of the theatrical setup that allows the placebo response to take effect. The specific mechanism is not necessary to explain any of the observed effects; the ritual is sufficient. The framework is not dismissive of placebo effects, which are real for subjective chronic-pain conditions, but is dismissive of the specific meridian-and-point theory of how acupuncture is supposed to work.

Are the meridian points real?

The evidence from sham-controlled trials does not support the traditional meridian-and-point theory. If correct meridian points were therapeutically necessary, wrong-location sham acupuncture should perform substantially worse than correct-point acupuncture. Across the trial literature, wrong-location needling typically performs nearly identically to correct-point needling, with small and often non-significant differences. This prediction failure is one of the strongest pieces of evidence against the specificity claim of traditional acupuncture theory.

Should I get acupuncture for my back pain?

The honest summary of the evidence is that acupuncture is reasonable for chronic pain that has not responded to other interventions, but you should expect a modest effect, most of which will come from the therapeutic ritual rather than the specific needling. If the ritual works for you, meaning the consultation, the attention, the calm environment, and the framework for interpreting your pain are themselves valuable, then acupuncture is a reasonable choice. If you specifically want to invest in the meridian-and-point mechanism that traditional acupuncture claims, the evidence does not support the claim that this mechanism is what produces the benefit.

What does this mean for evaluating other alternative-medicine claims?

The acupuncture evidence base is the clearest demonstration available of a general pattern: alternative-medicine effects shrink dramatically when the comparator is upgraded from “no treatment” to “active sham.” When evaluating any alternative-medicine claim, the diagnostic question is what the comparator was in the cited studies. If the comparator was a waitlist or no treatment, the study tells you little about the specific mechanism the product is claiming. If the comparator was a credible active sham and the intervention still outperformed, the evidence is more informative. The default assumption when reading a claim of efficacy based on a no-treatment comparison should be that a substantial fraction of the effect is the ritual rather than the specific mechanism.

Does this apply outside alternative medicine?

Yes. The pattern of “works better than nothing, barely better than active sham” is a wellness, mental-health-app, productivity-tool, dietary-supplement, and behavioral-nudge pattern as well as an alternative-medicine pattern. Almost any deliberate intervention into a fluctuating subjective condition --- pain, mood, focus, energy, sleep --- will outperform doing nothing because the user is paying attention to the problem and because the condition was likely worst at the moment they decided to intervene. The specific-mechanism contribution, when isolated by sham control, is often small. The strategic implication for evaluating any such product is the same as for acupuncture: ask what the comparator was, and discount efficacy claims that rest on no-treatment comparisons.


Atticus Li

Experimentation and growth leader. CXL-certified CRO practitioner, Mindworx-certified in behavioral economics. Led 100+ in-house experiments at NRG in 2025, with project evidence and limits documented in the case studies.
