Test Stopping Rules
Pre-defined criteria that specify when an A/B test should be concluded, preventing both premature stopping and unnecessarily prolonged experiments.
What Are Test Stopping Rules?
Test stopping rules are pre-defined criteria that specify exactly when an experiment should be concluded — before it launches. They prevent the two most common experimentation errors: stopping too early (declaring a winner before data is reliable) and stopping too late (wasting traffic on a test that's already decided). Without explicit rules, teams default to intuition, which reliably produces bad decisions.
Also Known As
- Marketing teams call them stopping criteria or test end conditions.
- Growth teams say stopping rules or ending criteria.
- Product teams use conclusion criteria or test exit rules.
- Engineering teams refer to stopping conditions or termination criteria.
- Statisticians distinguish between fixed-horizon and sequential stopping rules.
How It Works
Before launch, you document: "This test will end when (a) each variant reaches 25,000 users AND 14 calendar days have elapsed AND the primary metric is statistically significant at p < 0.05 with a relative effect of at least 3%, OR (b) 42 days have elapsed without reaching significance, in which case we declare the test inconclusive and stop." During the test, you check daily but do not stop until (a) or (b) is met. When marketing pushes to stop at day 7 because "it's clearly winning," you point to the pre-registered rule and keep running.
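The rule above is mechanical enough to encode directly. A minimal sketch in Python: the thresholds mirror the example rule in the text, and the function names (`stopping_decision`, `two_proportion_p_value`) are hypothetical, not from any particular testing library.

```python
import math

# Thresholds from the example rule in the text.
MIN_USERS_PER_VARIANT = 25_000
MIN_DAYS = 14
MAX_DAYS = 42
ALPHA = 0.05
MIN_RELATIVE_EFFECT = 0.03  # at least a 3% relative lift

def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0
    z = (p_b - p_a) / se
    return math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value

def stopping_decision(days_elapsed, n_a, conv_a, n_b, conv_b):
    """Return 'continue', 'stop: decided', or 'stop: inconclusive'."""
    if days_elapsed >= MAX_DAYS:
        return "stop: inconclusive"  # rule (b): time cap reached
    if min(n_a, n_b) < MIN_USERS_PER_VARIANT or days_elapsed < MIN_DAYS:
        return "continue"            # rule (a) preconditions not yet met
    p = two_proportion_p_value(conv_a, n_a, conv_b, n_b)
    rate_a, rate_b = conv_a / n_a, conv_b / n_b
    relative_effect = abs(rate_b - rate_a) / rate_a if rate_a else 0.0
    if p < ALPHA and relative_effect >= MIN_RELATIVE_EFFECT:
        return "stop: decided"       # rule (a) fully met
    return "continue"
```

The day-7 scenario falls out automatically: `stopping_decision(7, 12000, 600, 12000, 680)` returns `"continue"` because both the 14-day minimum and the 25,000-user minimum are unmet, no matter how good the interim numbers look.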
Best Practices
- Pre-register the rule and share it with all stakeholders before launch.
- Include both a minimum sample size AND minimum duration — never just one.
- If you need to peek, use sequential testing methods with adjusted significance thresholds.
- Allow "inconclusive" as a legitimate outcome, not a failure.
- Build automated alerts when stopping conditions are met rather than relying on manual checks.
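On the "peek with adjusted thresholds" practice: the simplest correction is to split the overall alpha evenly across the planned looks (a Bonferroni split). This is deliberately conservative; proper group-sequential boundaries such as Pocock or O'Brien-Fleming spend alpha more efficiently but are harder to hand-roll. A minimal sketch, with hypothetical function names:

```python
def sequential_threshold(overall_alpha: float, planned_looks: int) -> float:
    """Bonferroni split: a conservative per-look significance threshold
    that keeps the overall false-positive rate at or below overall_alpha."""
    return overall_alpha / planned_looks

def peek(p_values, overall_alpha=0.05):
    """Scan interim p-values in order; stop at the first look that clears
    the corrected threshold, otherwise report no early stop."""
    threshold = sequential_threshold(overall_alpha, len(p_values))
    for look, p in enumerate(p_values, start=1):
        if p < threshold:
            return f"stop at look {look}"
    return "no early stop"
```

Note the consequence: with five planned looks at an overall alpha of 0.05, each individual peek must clear p < 0.01, so an interim p = 0.04 that would "win" a fixed-horizon test does not justify stopping early.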
Common Mistakes
- Stopping at day 3 because "it's winning clearly" — ignoring the peeking problem.
- Running tests forever when results are ambiguous, introducing seasonality and cookie-churn bias.
- Changing stopping rules mid-test after seeing preliminary data — which invalidates the test.
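The peeking problem in the first mistake is easy to demonstrate with an A/A simulation: both "variants" share the same true conversion rate, so any declared winner is a false positive. The traffic numbers below are made up for illustration, and `false_positive_rate` and `p_value` are hypothetical helpers, not part of any testing library.

```python
import math
import random

def p_value(succ_a, succ_b, n):
    """Two-sided two-proportion z-test with equal sample sizes."""
    pooled = (succ_a + succ_b) / (2 * n)
    se = math.sqrt(pooled * (1 - pooled) * 2 / n)
    if se == 0:
        return 1.0
    z = (succ_b / n - succ_a / n) / se
    return math.erfc(abs(z) / math.sqrt(2))

def false_positive_rate(peek_daily, days=15, users_per_day=200,
                        true_rate=0.05, alpha=0.05, trials=300, seed=7):
    """A/A test: both arms have the same true rate, so any 'win' is spurious."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sa = sb = n = 0
        won = False
        for _ in range(days):
            n += users_per_day
            sa += sum(rng.random() < true_rate for _ in range(users_per_day))
            sb += sum(rng.random() < true_rate for _ in range(users_per_day))
            if peek_daily and p_value(sa, sb, n) < alpha:
                won = True   # stopped early on a spurious "win"
                break
        if not peek_daily:
            won = p_value(sa, sb, n) < alpha  # single look at the horizon
        hits += won
    return hits / trials
```

Comparing `false_positive_rate(True)` against `false_positive_rate(False)` shows the daily-peeking error rate sitting well above the nominal 5% of the single fixed-horizon look, which is exactly why "stopping at day 3 because it's winning" is a mistake.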
Industry Context
- SaaS/B2B: Patience is hardest here because traffic is slow; stopping rules matter most.
- Ecommerce/DTC: Seasonality makes strict duration caps especially important.
- Lead gen: Lead quality takes time to measure; stopping rules should include quality latency.
The Behavioral Science Connection
Stopping rules are the structural solution to hyperbolic discounting — our preference for immediate over delayed outcomes. By binding future decisions in advance, you override the in-the-moment temptation to act on premature data. It's Ulysses tying himself to the mast before the sirens.
Key Takeaway
Pre-register stopping rules before launch: they are the single most effective defense against peeking, the most common experimentation sin.