
Sequential Testing

A statistical approach that allows experimenters to check results continuously as data accumulates, rather than waiting for a fixed sample size — with proper controls for false positives.

Sequential testing solves one of the biggest practical problems in experimentation: the urge to peek at results before the test is "done." Traditional fixed-horizon testing says you must wait until the predetermined sample size is reached. Sequential testing says you can look anytime — as long as you adjust the significance thresholds accordingly.

Why Fixed-Horizon Testing Fails in Practice

In theory, you set a sample size, run the test, and analyze results once. In practice, stakeholders ask "is it winning yet?" on day 2. Product managers want to ship the winner before the quarter ends. Developers want to free up the experiment slot. Peeking is inevitable, and peeking in fixed-horizon tests inflates false positive rates dramatically.
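The inflation is easy to demonstrate with a quick simulation. The sketch below (function name and parameters are illustrative, not from any particular platform) runs A/A tests with no true effect and compares two analysis policies: looking once at the final sample size versus declaring a winner the first time any interim z-statistic crosses 1.96.

```python
import random

def simulate_peeking(n_sims=2000, n_looks=20, batch=50, z_crit=1.96, seed=0):
    """Simulate A/A tests (true effect = 0) and return the false positive
    rate under (a) a single fixed-horizon analysis at the end and
    (b) peeking after every batch and stopping at the first "significant" look."""
    rng = random.Random(seed)
    fixed_fp = peeking_fp = 0
    for _ in range(n_sims):
        total, count = 0.0, 0
        rejected_early = False
        for _ in range(n_looks):
            for _ in range(batch):
                total += rng.gauss(0, 1)  # per-unit outcome difference, mean 0
                count += 1
            z = total / count ** 0.5  # z-statistic of the running sum
            if abs(z) > z_crit:
                rejected_early = True
        if rejected_early:
            peeking_fp += 1
        if abs(z) > z_crit:  # final-look decision only
            fixed_fp += 1
    return fixed_fp / n_sims, peeking_fp / n_sims
```

With 20 interim looks, the peeking policy's false positive rate typically lands several times above the nominal 5%, while the single final analysis stays near it.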

How Sequential Testing Works

Sequential testing methods (such as the mixture sequential probability ratio test, or mSPRT, always-valid p-values, and confidence sequences) adjust the significance threshold at each observation point. You pay a small penalty in statistical power for the ability to peek — but you eliminate the false positive inflation that makes peeked-at fixed-horizon tests unreliable.
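As a concrete illustration, here is a minimal mSPRT sketch for normally distributed observations with known variance, using a normal mixing prior. The `sigma2` and `tau2` parameters are assumptions chosen for the example, not prescribed values. The resulting always-valid p-value only ever decreases, so it stays valid no matter when you stop and look.

```python
import math

def msprt_lambda(xbar, n, sigma2=1.0, tau2=1.0):
    """Mixture likelihood ratio for H0: mean = 0 against a N(0, tau2)
    mixing prior, with observations assumed N(mean, sigma2)."""
    v = sigma2 + n * tau2
    return math.sqrt(sigma2 / v) * math.exp(
        (n * n * tau2 * xbar * xbar) / (2 * sigma2 * v)
    )

def always_valid_p(xs, sigma2=1.0, tau2=1.0):
    """Running always-valid p-value sequence: nonincreasing over time,
    so it can be checked (and acted on) at any observation point."""
    p, total, ps = 1.0, 0.0, []
    for n, x in enumerate(xs, start=1):
        total += x
        lam = msprt_lambda(total / n, n, sigma2, tau2)
        p = min(p, 1.0 / lam)  # taking the running minimum keeps validity
        ps.append(p)
    return ps
```

On pure-noise data the p-value sits at 1.0 indefinitely; a genuine effect drives it down, and you can reject the moment it crosses your alpha.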

Which Method to Use

Most modern experimentation platforms (Optimizely, Statsig, Eppo) use some form of sequential testing. If you're building your own analysis, the always-valid confidence sequence is the most practical approach. For simpler implementations, group sequential designs with O'Brien-Fleming spending functions work well.
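For the group sequential route, a rough sketch of the Lan-DeMets O'Brien-Fleming-type spending function is below. One caveat: the per-look z thresholds here treat each alpha increment as a nominal level; exact group sequential boundaries require integrating over the joint distribution of the interim statistics, so read this as an approximation of the shape of the schedule rather than production-ready boundaries.

```python
import math
from statistics import NormalDist

N = NormalDist()  # standard normal

def obf_alpha_spent(t, alpha=0.05):
    """O'Brien-Fleming-type spending function: cumulative two-sided
    alpha spent by information fraction t (0 < t <= 1)."""
    z_half = N.inv_cdf(1 - alpha / 2)
    return 2 * (1 - N.cdf(z_half / math.sqrt(t)))

def spending_schedule(looks=5, alpha=0.05):
    """Incremental alpha at each of `looks` equally spaced analyses, with
    an approximate nominal z threshold per look."""
    schedule, spent = [], 0.0
    for k in range(1, looks + 1):
        cum = obf_alpha_spent(k / looks, alpha)
        inc = cum - spent          # alpha spent at this look
        spent = cum
        z = N.inv_cdf(1 - inc / 2) # approximate threshold for this look
        schedule.append((k / looks, inc, z))
    return schedule
```

The schedule shows why this design suits organizations that peek: early looks demand very large z-statistics (above 4 at the first of five looks), so nearly all of the alpha is preserved for the final analysis and the power penalty stays small.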

Practical Application

If your organization peeks at experiment results (and it does), adopt sequential testing. The mathematical framework makes peeking safe. It's better to use a valid statistical method that matches your actual behavior than to use a theoretically optimal method that everyone violates.