Winner Determination
The process of deciding which variant in an A/B test has genuinely outperformed the control, using statistical criteria and business judgment.
What Is Winner Determination?
Winner determination is the process of deciding whether an experiment has a shippable winner — and if so, which variant it is. It combines statistical criteria (p-values, confidence intervals, power), practical criteria (is the effect big enough to matter?), and business judgment (does shipping this create long-term value?). Done well, it compounds into an experimentation program that ships actual wins. Done poorly, it produces a flow of "ship it" decisions that don't move the business.
Also Known As
- Marketing teams call it declaring a winner or test call.
- Growth teams say winner determination or result call.
- Product teams use ship decision or test conclusion.
- Engineering teams refer to it as test verdict or experiment decision.
- Data science teams call it winner determination, shipping decision, or test conclusion.
How It Works
Consider a test that concludes after 28 days with its pre-registered sample size reached.
- Primary metric (signup rate): control 4.1% ± 0.15%, variant 4.5% ± 0.16%; chi-squared p = 0.003; relative lift +9.8%.
- Guardrail 1 (activation rate): variant 32.1% vs. control 32.4%, p = 0.34. No significant degradation.
- Guardrail 2 (30-day retention): variant 68% vs. control 68.5%, p = 0.52. No significant degradation.
- Practical significance: a 9.8% lift in signup rate translates to roughly $180K in ARR, clearly worth shipping.
- Ship decision: yes, with a 5% holdback to measure long-term impact.
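The arithmetic behind the worked example above can be sketched in a few lines of Python. This is a minimal sketch, not a prescribed method: it uses a pooled two-proportion z-test, which for a 2x2 table without continuity correction gives the same two-sided p-value as the chi-squared test. The function names and the conversion counts in the usage lines are hypothetical; the source does not give sample sizes, so the counts are illustrative and will not reproduce p = 0.003.

```python
import math


def relative_lift(control_rate: float, variant_rate: float) -> float:
    """Relative lift of the variant over control, as a fraction (0.10 = +10%)."""
    return (variant_rate - control_rate) / control_rate


def two_proportion_p_value(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-sided p-value for a pooled two-proportion z-test."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no variation at all, so no evidence of a difference
    z = (p_b - p_a) / se
    # P(|Z| >= |z|) for a standard normal variable
    return math.erfc(abs(z) / math.sqrt(2))


# Rates from the worked example: 4.1% control, 4.5% variant.
print(f"{relative_lift(0.041, 0.045):.1%}")  # 9.8%

# Hypothetical counts (assumed, not from the source): 20,000 users per arm.
print(two_proportion_p_value(820, 20_000, 900, 20_000))
```

Passing raw counts rather than rates keeps the test honest about sample size: the same 4.1% vs. 4.5% split can be significant or not depending on how many users stand behind it.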
Best Practices
- Pre-register winner criteria (primary metric threshold, guardrail rules, minimum effect size) before launch.
- Require statistical significance AND practical significance — both matter.
- Check all guardrails before shipping; a tripped guardrail usually means don't ship.
- Document the decision, including cases where you ship despite ambiguous statistics or don't ship despite a significant result.
- Default to not shipping when results are ambiguous; a false win stays in the product, and its cost keeps compounding.
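One way to make these practices mechanical is to encode the pre-registered criteria as data and evaluate results against them. The sketch below is illustrative only: the `Criteria` and `Guardrail` types, the `decide` function, and the thresholds (alpha of 0.05, minimum relative lift of 5%) are assumptions, not values from any particular experimentation platform. A guardrail is treated as tripped when the variant is worse than control and the difference is statistically significant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criteria:
    """Pre-registered before launch; default values are illustrative."""
    alpha: float = 0.05              # significance threshold
    min_relative_lift: float = 0.05  # smallest lift worth shipping


@dataclass(frozen=True)
class Guardrail:
    name: str
    control: float
    variant: float
    p_value: float
    higher_is_better: bool = True


def guardrail_tripped(g: Guardrail, alpha: float) -> bool:
    """Tripped means the variant is worse AND the difference is significant."""
    if g.higher_is_better:
        worse = g.variant < g.control
    else:
        worse = g.variant > g.control
    return worse and g.p_value < alpha


def decide(criteria: Criteria, lift: float, p_value: float,
           guardrails: list[Guardrail]) -> tuple[str, list[str]]:
    """Return ("ship" or "no_ship", reasons); any failed check means no_ship."""
    reasons = []
    if p_value >= criteria.alpha:
        reasons.append("primary metric not statistically significant")
    if lift < criteria.min_relative_lift:
        reasons.append("lift below pre-registered minimum effect size")
    for g in guardrails:
        if guardrail_tripped(g, criteria.alpha):
            reasons.append(f"guardrail tripped: {g.name}")
    return ("ship" if not reasons else "no_ship", reasons)


# Numbers from the worked example in "How It Works".
result = decide(
    Criteria(),
    lift=0.098,
    p_value=0.003,
    guardrails=[
        Guardrail("activation rate", control=0.324, variant=0.321, p_value=0.34),
        Guardrail("30-day retention", control=0.685, variant=0.68, p_value=0.52),
    ],
)
print(result)  # ('ship', [])
```

Returning the list of reasons alongside the verdict supports the documentation practice above: the record of why a test did or did not ship falls out of the decision itself. A human override is still possible, but it has to be written down as a departure from the recorded criteria.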
Common Mistakes
- Shipping based on p-value alone without checking practical significance or guardrails.
- Letting the loudest stakeholders overrule pre-registered criteria after seeing results.
- Treating "inconclusive" as "ship it anyway because we already built it."
Industry Context
- SaaS/B2B: Low traffic often produces ambiguous results; default-to-not-shipping is critical.
- Ecommerce/DTC: High traffic produces clearer results; focus on guardrail discipline.
- Lead gen: Lead quality guardrails often matter more than lead volume lift.
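Why low traffic tends to produce ambiguous results becomes concrete once the required sample size is computed. The sketch below uses the standard normal-approximation formula for comparing two proportions; the function name and the defaults (two-sided alpha of 0.05, 80% power) are assumptions chosen for illustration, not values from the source.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(p_control: float, p_variant: float,
                        alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate users needed per arm to detect p_control vs. p_variant."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_control + p_variant) / 2
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(
            p_control * (1 - p_control) + p_variant * (1 - p_variant)
        )
    ) ** 2
    return math.ceil(numerator / (p_variant - p_control) ** 2)


# Baseline and variant rates from the worked example: 4.1% vs. 4.5%.
print(sample_size_per_arm(0.041, 0.045))
```

Under these assumptions, detecting a lift from 4.1% to 4.5% takes on the order of 40,000 users per arm. A low-traffic B2B product may need months to reach that, which is why stopping early there so often yields an inconclusive result.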
The Behavioral Science Connection
Winner determination is where confirmation bias and the sunk-cost fallacy are most dangerous. Teams that invested weeks building a variant want it to win, and they will unconsciously rationalize ambiguous data toward that conclusion. Pre-registration is the Ulysses contract that binds your post-result judgment to your pre-result criteria.
Key Takeaway
A winner requires statistical significance, practical significance, clean guardrails, and pre-registered criteria. If any of those is missing, default to not shipping.