Experiment Prioritization (ICE/PIE/RICE)
Frameworks for ranking A/B test ideas by expected value, using scoring systems like ICE (Impact, Confidence, Ease), PIE (Potential, Importance, Ease), or RICE (Reach, Impact, Confidence, Effort).
What Is Experiment Prioritization?
Experiment prioritization is the use of structured scoring frameworks — ICE, PIE, RICE, and others — to rank test ideas by expected value. Without them, test backlogs default to the HiPPO (Highest Paid Person's Opinion), the loudest advocate, or whatever generated excitement in the last meeting. With them, teams make consistently better allocation decisions and surface productive disagreements about where value lives.
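The three frameworks reduce to simple arithmetic. A minimal sketch of the scoring formulas (function names and the RICE example inputs are illustrative, not a standard API):

```python
# ICE and PIE multiply their three inputs; RICE divides by Effort,
# so expensive tests are penalized rather than rewarded.

def ice(impact: float, confidence: float, ease: float) -> float:
    """ICE score: Impact x Confidence x Ease, each typically rated 1-10."""
    return impact * confidence * ease

def pie(potential: float, importance: float, ease: float) -> float:
    """PIE score: Potential x Importance x Ease, each typically rated 1-10."""
    return potential * importance * ease

def rice(reach: float, impact: float, confidence: float, effort: float) -> float:
    """RICE score: (Reach x Impact x Confidence) / Effort.
    Reach is people per period, Confidence a 0-1 fraction,
    Effort in person-months."""
    return reach * impact * confidence / effort

print(ice(8, 5, 3))           # -> 120 (the pricing-page example below)
print(rice(2000, 2, 0.8, 4))  # -> 800.0 (illustrative inputs)
```

Note that some teams average the ICE inputs instead of multiplying them; this article's worked example multiplies, so the sketch does too.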
Also Known As
- Marketing teams call it test prioritization, ICE scoring, or backlog scoring.
- Growth teams say ICE, RICE, PIE, or experiment scoring.
- Product teams use RICE (most common in product) or experiment prioritization.
- Engineering teams refer to ICE/RICE scoring or experiment ranking.
- Data science teams use expected value scoring or test ROI.
How It Works
Three test ideas: (1) new pricing page layout — ICE: Impact 8, Confidence 5, Ease 3 → score 120. (2) checkout button copy — ICE: Impact 3, Confidence 7, Ease 9 → score 189. (3) onboarding email redesign — ICE: Impact 6, Confidence 6, Ease 6 → score 216. By ICE, the email redesign wins — but the scores are within noise of each other: drop the email's Impact from 6 to 5 and its score falls to 180, below the checkout test's 189. The real value comes from the discussion: why did the pricing advocate rate Impact 8? Why did the checkout advocate rate Ease 9? Where do scores diverge by 3+ points across team members? That divergence is where the learning lives.
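Surfacing that divergence can be mechanized. A sketch of the 3+-point rule, assuming each team member rates each idea on the three ICE dimensions (the rater names and scores here are hypothetical):

```python
# idea -> rater -> (impact, confidence, ease); illustrative numbers only
ratings = {
    "pricing page layout":  {"ana": (8, 5, 3), "ben": (4, 6, 3), "cai": (8, 4, 4)},
    "checkout button copy": {"ana": (3, 7, 9), "ben": (3, 6, 8), "cai": (4, 7, 9)},
}

DIMENSIONS = ("impact", "confidence", "ease")

def divergent_dimensions(member_scores, threshold=3):
    """Return the dimensions whose max-min spread across raters
    meets the threshold -- the places worth discussing."""
    flagged = []
    for i, dim in enumerate(DIMENSIONS):
        values = [scores[i] for scores in member_scores.values()]
        if max(values) - min(values) >= threshold:
            flagged.append(dim)
    return flagged

for idea, members in ratings.items():
    flags = divergent_dimensions(members)
    if flags:
        print(f"{idea}: discuss {', '.join(flags)}")
# prints: pricing page layout: discuss impact
```

With these sample ratings, only the pricing page's Impact is flagged (spread of 4 points), which is exactly the conversation the framework is meant to provoke.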
Best Practices
- Pick one framework and use it consistently — switching frameworks adds cognitive overhead without improving decisions.
- Score as a team exercise; don't let one person score for everyone.
- Focus discussion on items where scores diverge, not items with high aggregate scores.
- Use scores to separate quartiles, not to rank individual items with false precision.
- Reserve 20–30% of test slots for exploratory ideas with low scores but high learning potential.
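The quartile practice above can be sketched with the standard library; the idea names and scores are hypothetical:

```python
from statistics import quantiles

# Bucket scored ideas into rough tiers instead of ranking them one by one.
scored = {"pricing page": 120, "checkout copy": 189, "onboarding email": 216,
          "nav redesign": 60, "social proof badge": 150, "exit popup": 90,
          "free trial CTA": 252, "faq rewrite": 48}

q1, q2, q3 = quantiles(scored.values(), n=4)  # the three quartile cut points

def tier(score):
    """Map a raw score to a coarse quartile tier."""
    if score >= q3: return "top"
    if score >= q2: return "upper-middle"
    if score >= q1: return "lower-middle"
    return "bottom"

for idea, score in sorted(scored.items(), key=lambda kv: -kv[1]):
    print(f"{idea:20s} {score:4d}  {tier(score)}")
```

Within a tier, ordering is treated as arbitrary: "216 vs 189" is noise, but "top quartile vs bottom quartile" is signal.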
Common Mistakes
- Treating scores as precise — a 7.3 vs a 7.1 difference is noise.
- Using a framework rigidly and skipping ideas with strategic or learning value.
- Letting the HiPPO override top-scoring items for political reasons.
Industry Context
- SaaS/B2B: RICE fits well because Reach matters — some tests affect trial users, others affect paying customers.
- Ecommerce/DTC: ICE's simplicity usually wins; fast iteration beats frameworks with more inputs.
- Lead gen: PIE was created at WiderFunnel for exactly this context; its Importance dimension, which weighs how much valuable traffic a page receives, matters most here.
The Behavioral Science Connection
Prioritization frameworks combat availability bias — our tendency to overweight recent or vivid ideas. They also provide structure against sunk cost fallacy ("we already built half of this, let's test it") by forcing explicit scoring on the merits. The ritual of scoring matters more than the exact numbers it produces.
Key Takeaway
Pick a framework, use it consistently, focus discussion on score divergence, and never let the decimal point fool you into false precision.