Sample Size

The number of visitors or users needed in each variation of an A/B test to detect a meaningful difference with statistical confidence.

Sample size is the most underappreciated factor in A/B testing. Running a test with too few visitors is like trying to hear a whisper in a crowded room: the signal gets lost in the noise. Running with too many wastes time and traffic you could have spent on the next test.

Why Sample Size Matters

An underpowered test (one with too small a sample) will often miss real effects. That means potentially valuable changes get rejected as "inconclusive," and your team loses confidence in the testing program.
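
To see what "underpowered" means in numbers, here is a minimal sketch using the normal approximation to a two-sided two-proportion z-test (the function name and traffic figures are illustrative, not taken from any particular tool):

```python
# Rough power check for a two-proportion A/B test, using the normal
# approximation. Illustrative sketch only -- verify real tests with a
# vetted power calculator or library.
from statistics import NormalDist

def approx_power(p_base, rel_lift, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided two-proportion z-test."""
    p_var = p_base * (1 + rel_lift)      # variant rate if the lift is real
    delta = p_var - p_base               # absolute difference in rates
    se = ((p_base * (1 - p_base) + p_var * (1 - p_var)) / n_per_arm) ** 0.5
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(abs(delta) / se - z_crit)

# Example: 3% baseline, 10% relative lift, only 2,000 visitors per arm
print(f"power = {approx_power(0.03, 0.10, 2_000):.0%}")  # roughly 8%
```

At roughly 8% power, a test this size would miss a genuine 10% lift more than nine times out of ten.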

How to Calculate Sample Size

Four inputs determine required sample size:

  • Baseline conversion rate: Your current rate (e.g., 3% checkout conversion)
  • Minimum Detectable Effect (MDE): The smallest improvement worth detecting (e.g., 10% relative lift = 3.0% → 3.3%)
  • Statistical power: The probability of detecting a real effect if one exists (standard: 80%)
  • Significance level: The false-positive rate you're willing to accept (standard: 5%)

Lower baseline rates and smaller MDEs require dramatically larger samples, because required sample size grows with roughly the inverse square of the effect size. Detecting a 5% relative lift on a 2% conversion rate requires roughly 16x the sample needed to detect a 20% relative lift.
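
For the curious, here is a minimal sketch of the calculation itself, using the standard normal-approximation formula for comparing two proportions (the helper name and rounding choices are mine, not a specific calculator's method):

```python
# Per-variant sample size estimate for a two-proportion A/B test,
# via the normal approximation: n = (z_alpha + z_power)^2 * var / delta^2.
# A sketch -- cross-check production tests against a vetted calculator.
from math import ceil
from statistics import NormalDist

def sample_size_per_arm(p_base, rel_lift, alpha=0.05, power=0.80):
    p_var = p_base * (1 + rel_lift)          # variant rate under the MDE
    delta = p_var - p_base                   # absolute effect to detect
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided criterion
    z_power = NormalDist().inv_cdf(power)
    variance = p_base * (1 - p_base) + p_var * (1 - p_var)
    return ceil((z_alpha + z_power) ** 2 * variance / delta ** 2)

# The example above: 3% baseline, 10% relative lift, 80% power
print(sample_size_per_arm(0.03, 0.10))   # ~53,000 per variant

# The scaling claim: 5% vs. 20% relative lift on a 2% baseline
small = sample_size_per_arm(0.02, 0.05)
big = sample_size_per_arm(0.02, 0.20)
print(round(small / big))                # ~15 (inverse-square rule predicts 16)
```

The last line shows the inverse-square relationship directly: quartering the effect size multiplies the required sample by roughly sixteen.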

The Hard Truth About Small Sites

If your site gets fewer than 10,000 visitors per month, you likely can't run valid A/B tests on low-conversion events (like purchases). You have three options (a sketch after this list makes the arithmetic concrete):

  • Test higher-frequency events (clicks, form starts) as proxy metrics
  • Run longer tests (accepting the risk of seasonal/external confounds)
  • Use qualitative methods (user testing, surveys) instead of statistical testing
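
As a back-of-envelope feasibility check (all figures hypothetical), the math for a small site looks like this:

```python
# How long would a purchase-conversion test take on a small site?
# Hypothetical traffic numbers; the per-arm requirement comes from the
# sample size sketch above (3% baseline, 10% MDE, 80% power).
monthly_visitors = 8_000      # below the ~10,000/month threshold
n_variants = 2                # control + one variant
needed_per_arm = 53_000

months = needed_per_arm * n_variants / monthly_visitors
print(f"~{months:.0f} months of traffic needed")  # ~13 months: not practical
```

Over a year-long test, seasonality and external changes would swamp any real effect, which is exactly the confound the second option above accepts.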

What I Tell Clients

"If you don't have enough traffic for a valid test, don't fake it by lowering your standards. An unreliable test result is worse than no result — because it gives false confidence."