Bonferroni Correction
A method for adjusting significance thresholds when performing multiple statistical comparisons, dividing the desired alpha level by the number of tests to control the overall false positive rate.
What Is the Bonferroni Correction?
Bonferroni is the simplest way to keep multiple comparisons from producing bogus wins. If you are running k tests and want an overall false positive rate of alpha, require each individual test to clear alpha / k. It trades statistical power for a strict family-wise error guarantee.
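The rule above is a one-liner in code. A minimal sketch (the function name `bonferroni_alpha` is just for illustration):

```python
def bonferroni_alpha(alpha: float, k: int) -> float:
    """Per-test significance threshold under Bonferroni:
    divide the desired overall alpha by the number of tests."""
    return alpha / k

# 3 comparisons at an overall alpha of 0.05:
print(bonferroni_alpha(0.05, 3))  # 0.05 / 3, about 0.0167
```

Each individual p-value must fall below this stricter threshold to count as significant.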
Also Known As
- Data science teams: Bonferroni, alpha adjustment, family-wise error rate control
- Growth teams: multi-variant correction
- Marketing teams: the "divide alpha by number of variants" rule
- Engineering teams: FWER correction
How It Works
Imagine running an A/B/C/D test with 10,000 visitors per variant and three comparisons against control. At a naive alpha of 0.05, each test has a 5% false-positive chance, and the probability that at least one of the three is a false positive is roughly 14%. Bonferroni fixes this by requiring that each test clear p < 0.0167 (0.05 / 3). If one of your challengers comes in at p = 0.025, it looked significant at first glance — but after correction, it is not.
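The arithmetic in that example can be checked directly. This sketch computes the family-wise error rate for three independent tests at a naive alpha and applies the corrected threshold to the example p-value:

```python
# Family-wise error rate: probability of at least one false positive
# across k independent tests, each run at the naive alpha.
alpha, k = 0.05, 3
fwer_naive = 1 - (1 - alpha) ** k
print(round(fwer_naive, 3))  # ~0.143, the "roughly 14%" above

# Bonferroni-corrected per-test threshold.
threshold = alpha / k  # about 0.0167
p = 0.025
print(p < alpha)      # True:  significant at the naive threshold
print(p < threshold)  # False: not significant after correction
```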
Best Practices
- Do apply Bonferroni when any false positive carries serious cost (e.g., pharma-style decisions).
- Do pre-specify which metrics are primary so you are only correcting across real hypotheses.
- Do consider Holm-Bonferroni for a slightly more powerful alternative with the same guarantee.
- Do not apply Bonferroni across dozens of exploratory metrics; switch to FDR methods instead.
- Do not correct within a single test that already reports a single primary metric.
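Holm-Bonferroni, mentioned above, is worth sketching: it sorts the p-values and tests them against progressively looser thresholds, rejecting until the first failure. It controls the same family-wise error rate with a bit more power. A minimal sketch (the function name `holm_bonferroni` is illustrative, not from any particular library):

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Holm's step-down procedure. Returns a reject (True) /
    fail-to-reject (False) decision per hypothesis, in input order."""
    k = len(p_values)
    order = sorted(range(k), key=lambda i: p_values[i])
    reject = [False] * k
    for rank, i in enumerate(order):
        # Smallest p-value faces alpha/k, the next alpha/(k-1), and so on.
        if p_values[i] <= alpha / (k - rank):
            reject[i] = True
        else:
            break  # once one test fails, all larger p-values fail too
    return reject

# Three challenger p-values against control:
print(holm_bonferroni([0.01, 0.04, 0.03]))  # [True, False, False]
```

Plain Bonferroni would make the same calls here, but when several p-values are small, Holm rejects hypotheses that Bonferroni's single fixed threshold would miss.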
Common Mistakes
- Forgetting that peeking across time is also a multiple comparisons problem.
- Applying Bonferroni across correlated metrics where it overcorrects severely.
- Treating Bonferroni as the only option when FDR methods would be more appropriate.
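One common FDR method is the Benjamini-Hochberg step-up procedure, which controls the expected fraction of false discoveries rather than the chance of any single one. A minimal sketch (the function name `benjamini_hochberg` is illustrative):

```python
def benjamini_hochberg(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure controlling the false
    discovery rate at level q. Returns reject decisions in input order."""
    k = len(p_values)
    order = sorted(range(k), key=lambda i: p_values[i])
    # Find the largest rank r (1-indexed) where p_(r) <= (r / k) * q.
    max_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / k * q:
            max_rank = rank
    # Reject every hypothesis at or below that rank.
    reject = [False] * k
    for rank, i in enumerate(order, start=1):
        if rank <= max_rank:
            reject[i] = True
    return reject

print(benjamini_hochberg([0.01, 0.04, 0.03]))  # [True, True, True]
```

On these same three p-values, Bonferroni's 0.0167 threshold would accept only the first. That gap is the point: with many exploratory metrics, FDR control keeps reasonable power where Bonferroni would reject almost everything.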
Industry Context
- SaaS/B2B: Multi-variant tests are common; Bonferroni keeps you honest.
- Ecommerce/DTC: Primary-metric discipline often avoids the need for corrections entirely.
- Lead gen/services: Small-sample programs already lack power; Bonferroni can crush it further.
The Behavioral Science Connection
Bonferroni counteracts what statisticians call data dredging: the human tendency to find meaning in any surface you examine long enough. Kahneman's observation that we instinctively see patterns in randomness describes precisely what unchecked multiple testing produces. Bonferroni is a statistical rule that enforces modesty.
Key Takeaway
Use Bonferroni when false positives are expensive and comparisons are few; switch to FDR methods when you have many exploratory tests.