ICE Scoring Framework
A prioritization framework that scores experiment ideas by Impact, Confidence, and Ease — each on a 1–10 scale — to rank which tests to run first.
What Is the ICE Scoring Framework?
ICE scoring, popularized by Sean Ellis, is the most widely used experiment prioritization framework. Each test idea gets three scores: Impact (how much will this move the target metric?), Confidence (how sure are we this will work?), and Ease (how quickly can we implement this?). The ICE score is typically the average or product of these three numbers.
The value of ICE isn't mathematical precision — it's structured conversation. Forcing teams to articulate and defend scores for each dimension improves hypothesis quality regardless of the final number.
Also Known As
- Marketing: Campaign prioritization framework, ICE model
- Sales: Deal scoring, opportunity prioritization
- Growth: ICE, PIE (Potential/Importance/Ease), RICE
- Product: Feature prioritization framework, ICE matrix
- Engineering: Task prioritization scoring
- Data: Test prioritization model, impact-confidence-ease
How It Works
A growth team has 22 test ideas in their backlog. Rather than debating each one subjectively, they score each on Impact (1–10), Confidence (1–10), and Ease (1–10). A homepage hero test scores 8/7/5 (ICE = 280). A pricing page headline test scores 7/8/9 (ICE = 504). A full checkout redesign scores 9/5/2 (ICE = 90).
The pricing test surfaces to the top despite lower impact because its combination of strong evidence and fast execution produces a better bet. The team runs the top-scored tests first and revisits scores as new evidence emerges.
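The ranking above can be sketched in a few lines of Python. This uses the product variant of the ICE score from the worked example; the idea names and scores mirror that example, and the helper function name is illustrative.

```python
# Rank a test backlog by ICE score (product variant: Impact x Confidence x Ease).

def ice_score(impact: int, confidence: int, ease: int) -> int:
    """Combine the three 1-10 scores into a single rank key."""
    return impact * confidence * ease

backlog = [
    ("Homepage hero test", 8, 7, 5),
    ("Pricing page headline test", 7, 8, 9),
    ("Full checkout redesign", 9, 5, 2),
]

# Sort highest score first, so the best bets surface to the top of the queue.
ranked = sorted(backlog, key=lambda idea: ice_score(*idea[1:]), reverse=True)
for name, i, c, e in ranked:
    print(f"{name}: ICE = {ice_score(i, c, e)}")
# -> Pricing page headline test: ICE = 504
# -> Homepage hero test: ICE = 280
# -> Full checkout redesign: ICE = 90
```

Switching the product for an average only changes the spread of scores, not the principle; the product variant punishes a single weak dimension more heavily.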
Best Practices
- Score Impact based on traffic × current conversion × expected lift, not intuition.
- Tie Confidence scores to specific evidence — research findings, past test results, competitor data.
- Score Ease against a standard reference — a copy test is a 9, a full redesign is a 2.
- Score as a team, not individually — the debate is the point.
- Re-score quarterly as evidence accumulates.
Common Mistakes
- Treating ICE scores as precise measurements — they're directional estimates, not exact values.
- Inflating all scores so everything is 8+ across the board, which defeats the ranking.
- Ignoring strategic value — some low-ICE tests are worth running because they unlock future work.
Industry Context
SaaS/B2B: ICE works well for feature and flow tests. B2B teams often add a fourth dimension — strategic alignment — to capture tests that matter for positioning even if ICE is lower.
Ecommerce/DTC: ICE scales to large backlogs where quick ranking is needed. RICE (adding Reach) often fits ecommerce better because traffic varies widely across pages.
Lead gen: ICE is excellent for small teams making fast decisions across landing pages and forms where most tests are relatively quick to implement.
The Behavioral Science Connection
ICE scoring counters the planning fallacy in prioritization — the systematic tendency to underestimate effort and overestimate impact. By forcing teams to write scores down and defend them, ICE makes overoptimistic estimates visible and correctable. It also combats authority bias by giving every team member an equal vote in the scoring conversation.
Key Takeaway
ICE isn't about getting the scores perfect — it's about surfacing the reasoning behind them, which is where real prioritization quality comes from.