Alpha Level

The pre-specified threshold for statistical significance — the probability of a Type I error you are willing to accept, conventionally 0.05.

What Is Alpha Level?

Alpha is the line in the sand you draw before running a test. It is the maximum probability of falsely declaring a winner, when no true difference exists, that you are willing to tolerate. If alpha is 0.05, you accept a 5% false positive rate on every test, metric, and comparison where the null hypothesis is actually true. Alpha is a choice, not a law of nature, and choosing it thoughtfully is one of the few decisions that separates a serious experimentation program from a theater of dashboards.
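
One way to make the "5% false positive rate" concrete is to simulate A/A tests, where both arms share the same true conversion rate, and count how often the test still declares a difference. The sketch below is illustrative only: it assumes a pooled two-proportion z-test with equal arm sizes, and the function names, the 10% rate, the 1,000 users per arm, and the 2,000 simulated tests are all hypothetical choices rather than anything prescribed here.

```python
import random
from statistics import NormalDist


def two_proportion_p_value(conv_a, conv_b, n):
    """Two-sided p-value from a pooled two-proportion z-test (equal arm sizes)."""
    pooled = (conv_a + conv_b) / (2 * n)
    se = (2 * pooled * (1 - pooled) / n) ** 0.5
    if se == 0:
        return 1.0  # no variation at all, so no evidence of a difference
    z = (conv_b / n - conv_a / n) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def aa_false_positive_rate(alpha=0.05, rate=0.10, n=1000, trials=2000, seed=7):
    """Share of A/A tests (no true difference) that cross the alpha threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        conv_a = sum(rng.random() < rate for _ in range(n))
        conv_b = sum(rng.random() < rate for _ in range(n))
        if two_proportion_p_value(conv_a, conv_b, n) < alpha:
            hits += 1
    return hits / trials


fp_rate = aa_false_positive_rate()
print(f"Share of A/A tests declared significant: {fp_rate:.3f}")
```

With enough simulated tests, the printed share should land close to the chosen alpha, which is exactly the error rate alpha is meant to cap.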

Also Known As

  • Data science: significance level, p-value threshold
  • Growth: "the 95% confidence bar"
  • Marketing: confidence level (1 - alpha)
  • Engineering: decision threshold

How It Works

Run a two-sided test with alpha = 0.05 and baseline 10% conversion. The critical z-value is ±1.96. Any observed z above 1.96 or below -1.96 crosses the bar. At alpha = 0.10 the critical z shrinks to ±1.645 — you ship more winners and accept more false positives. At alpha = 0.01 the bar rises to ±2.576 — fewer false positives, but you need much more sample to clear it.
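
The critical values quoted above can be reproduced from the standard normal distribution. This is a minimal sketch using Python's standard library; `critical_z` is a hypothetical helper name, not part of any testing tool.

```python
from statistics import NormalDist


def critical_z(alpha, two_sided=True):
    """Critical z-value for a given alpha under the standard normal."""
    # A two-sided test splits alpha evenly across both tails.
    tail = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1 - tail)


for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f}: reject when |z| > {critical_z(alpha):.3f}")
```

Passing `two_sided=False` puts all of alpha in one tail, so a one-sided test at alpha = 0.025 uses the same 1.96 cutoff as a two-sided test at alpha = 0.05.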

Alpha interacts with sample size. At fixed effect size, variance, and 80% power, halving alpha from 0.05 to 0.025 requires roughly 20% more data, and tightening it from 0.05 to 0.01 requires roughly 50% more.
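
The cost of a stricter alpha can be estimated with the usual normal-approximation rule that required sample size is proportional to (z for alpha/2 + z for power) squared when effect size and variance are held fixed. The sketch below assumes a two-sided test and 80% power by default; `relative_sample_size` is a hypothetical helper, and the exact multiplier shifts with the power target and the starting alpha.

```python
from statistics import NormalDist


def relative_sample_size(alpha, power=0.80, baseline_alpha=0.05):
    """Sample size at `alpha` relative to `baseline_alpha`, two-sided test,
    same effect size, variance, and power (normal approximation)."""
    z = NormalDist().inv_cdf

    def required(a):
        # n is proportional to (z_{1 - a/2} + z_{power})^2
        return (z(1 - a / 2) + z(power)) ** 2

    return required(alpha) / required(baseline_alpha)


for alpha in (0.10, 0.025, 0.01):
    ratio = relative_sample_size(alpha)
    print(f"alpha = {alpha}: {ratio:.2f}x the sample needed at alpha = 0.05")
```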

Best Practices

  • Set alpha based on the cost of being wrong. A paywall change deserves alpha 0.01; a button-color test can live at 0.10.
  • Lock alpha in the test doc before launch. Ex-post alpha-shopping is cheating.
  • Use a one-sided test only when a directional hypothesis is pre-registered and defensible. A one-sided alpha of 0.025 keeps the same critical value (1.96) as a two-sided test at 0.05.
  • Apply corrections for multiple comparisons when testing multiple variants or metrics.
  • Report the exact p-value, not just "significant / not significant," so future meta-analyses can be done.
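
As an illustration of the multiple-comparisons point above, here is a minimal sketch of two common corrections: Bonferroni, which divides alpha evenly across comparisons, and Holm's step-down procedure, which is never less powerful than Bonferroni. These are offered as examples only, not as the required method, and the function names and p-values are hypothetical.

```python
def bonferroni_alpha(alpha, num_comparisons):
    """Per-comparison threshold that caps the family-wise error rate at alpha."""
    return alpha / num_comparisons


def holm_reject(p_values, alpha=0.05):
    """Holm step-down procedure: returns a reject/keep flag per p-value,
    in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The smallest p-value faces alpha/m, the next alpha/(m-1), and so on.
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one comparison fails, all larger p-values fail too
    return reject


p_values = [0.01, 0.04, 0.03]  # hypothetical results for three variants
print(bonferroni_alpha(0.05, len(p_values)))
print(holm_reject(p_values))
```

Whichever correction is used, it belongs in the test doc alongside alpha itself, so it is fixed before any data arrives.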

Common Mistakes

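  • Mistaking alpha or the p-value for the probability that the result is wrong. A p-value of 0.04 does not mean a 96% chance that the variant truly beats control. It means that, if there were no real difference, data at least this extreme would appear 4% of the time. Reading it as the probability of a hypothesis is a Bayesian interpretation of a frequentist number.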
  • Using alpha 0.05 reflexively on everything regardless of stakes or reversibility.
  • Changing alpha mid-test because the result is "almost there."

Industry Context

In SaaS/B2B, alpha 0.10 is often more appropriate than 0.05 given the cost of missed improvements at low traffic. In ecommerce, high-volume teams can afford alpha 0.01 on revenue-moving tests to protect against shipping lossy variants. In lead gen, alpha choice should be tied to downstream economics — the cost of a false positive that degrades lead quality can be 10x the cost of missing a true positive on form fills.

The Behavioral Science Connection

Humans treat 0.05 as a magic number, a phenomenon called threshold fetishism. A p-value of 0.049 feels meaningfully different from 0.051, but the underlying evidence is nearly identical. Pre-registering alpha protects against the motivated reasoning that kicks in when a stakeholder-favored variant lands at p = 0.06.
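
To see how little separates the two sides of the threshold, the p-values can be converted back to the z-statistics that produce them. This is a small sketch assuming a two-sided z-test; `z_from_p` is a hypothetical helper.

```python
from statistics import NormalDist


def z_from_p(p_value):
    """Absolute z-statistic that yields a given two-sided p-value."""
    return NormalDist().inv_cdf(1 - p_value / 2)


z_just_under = z_from_p(0.049)
z_just_over = z_from_p(0.051)
print(f"p = 0.049 -> |z| = {z_just_under:.3f}")
print(f"p = 0.051 -> |z| = {z_just_over:.3f}")
print(f"gap in z: {z_just_under - z_just_over:.3f}")
```

The gap between the two test statistics is a small fraction of one standard error, even though one result is labeled "significant" and the other is not.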

Key Takeaway

Alpha is a business decision dressed up as a statistical one. Choose it deliberately, tie it to consequences, and lock it down before you see any data.