Most Optimizely users pick "conversion" as their metric type and move on. Which means they're leaving significant analytical precision on the table — and in some cases, drawing completely wrong conclusions.

Optimizely supports three distinct metric types, and they answer genuinely different questions. Using the wrong one isn't just a minor technical issue; it can make a losing test look like a winner.

The Three Metric Types, Simply Explained

Conversion Metrics (Binary)

A binary 1 or 0 per visitor. Did they do the thing or not?

Examples: purchased (yes/no), clicked CTA (yes/no), reached confirmation page (yes/no), submitted form (yes/no)

What it measures: the percentage of visitors who performed the action at least once

Formula: converters / total visitors

The denominator: all visitors in the experiment, regardless of whether they converted

Numeric / Revenue Metrics

A sum or average of a numeric value per visitor.

Examples: total revenue, total items added to cart, total session duration

What it measures: the total or average value of the numeric event across all visitors

Formula: sum of values / total visitors (for "per visitor" averaging)

Important: visitors who never trigger the event contribute 0 to the numerator. This makes the distribution highly skewed — lots of zeros, a few large values — which increases variance.

Ratio Metrics

A calculated ratio of two numeric values, where both the numerator and denominator are event-based.

Examples: revenue per purchase (revenue / number of purchases), pages per session (page views / sessions), items per order (items added / purchases made)

What it measures: the average value of the numerator event per occurrence of the denominator event

The key difference: the denominator is event count, not visitor count. You're measuring something like "average order value" (total revenue / total orders), not "revenue per visitor."

When to Use Each

If you're asking "did MORE people do X," use a conversion metric. If you're asking "did people do X MORE (in terms of value or quantity)," use a numeric or ratio metric. These are different questions with different answers.

  • Did more people buy? → Conversion (Purchase CVR)
  • How much revenue per visitor? → Numeric/Revenue (Revenue per visitor)
  • What was the average order value? → Ratio (Revenue / purchases)
  • Did checkout starts increase? → Conversion (Checkout start rate)
  • Did people add more items per cart? → Ratio (Items / add-to-cart events)
  • Did video engagement time increase? → Numeric (Total watch time per visitor)
  • Did form completion improve? → Conversion (Form submit rate)
  • Did repeat purchase rate change? → Ratio (Purchases / sessions)

**Pro Tip:** Ratio metrics sound precise but have a hidden complexity — the denominator is event-based, not visitor-based. A ratio metric excludes visitors who never trigger the denominator event, which can dramatically reduce your effective sample size. Always check how many visitors are actually contributing to a ratio metric before trusting the result.

Why Revenue Per Visitor Beats Revenue Per Purchase for Most Ecommerce Tests

This is the metric question I get most often from ecommerce teams.

Revenue per purchase (a ratio metric) = total revenue / total number of purchases

This answers: "When someone buys, how much do they spend on average?"

Revenue per visitor (a numeric metric) = total revenue / total visitors

This answers: "On average, how much revenue does each visitor to this experience generate?"

For most ecommerce experiments, revenue per visitor is the better primary metric. Here's why:

A test that changes your product page might simultaneously:

  • Increase conversion rate (more people buy)
  • Decrease average order value (people buy simpler, cheaper items due to simplified UI)

If you use revenue per purchase as your metric, you'd see AOV go down and flag the test as a loss. But if you use revenue per visitor, you might see it go up because the CVR increase more than compensates for the AOV decrease.

Worked Example

Baseline: 10,000 visitors, 3% CVR, $80 AOV

  • Revenue per purchase: $80
  • Revenue per visitor: $2.40 (3% x $80)

Variant: 10,000 visitors, 3.5% CVR, $72 AOV

  • Revenue per purchase: $72 (looks worse — a loss?)
  • Revenue per visitor: $2.52 (3.5% x $72)

Revenue per visitor increased by 5%. Revenue per purchase decreased by 10%. Same test, opposite conclusions depending on the metric.

The business impact: revenue per visitor is what determines your total revenue. If 100,000 visitors see the variant, the baseline generates $240,000 and the variant generates $252,000. The variant wins.
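
The arithmetic above is easy to sanity-check in a few lines. A quick sketch using the figures from the worked example (the helper name is mine, not an Optimizely API):

```python
def revenue_per_visitor(cvr: float, aov: float) -> float:
    """Revenue per visitor decomposes into conversion rate x average order value."""
    return cvr * aov

baseline_rpv = revenue_per_visitor(0.03, 80.0)   # baseline: 3% CVR, $80 AOV
variant_rpv = revenue_per_visitor(0.035, 72.0)   # variant: 3.5% CVR, $72 AOV

rpv_lift = variant_rpv / baseline_rpv - 1        # RPV moves up...
aov_lift = 72.0 / 80.0 - 1                       # ...while AOV moves down

print(f"RPV: ${baseline_rpv:.2f} -> ${variant_rpv:.2f} ({rpv_lift:+.1%})")
print(f"AOV: $80.00 -> $72.00 ({aov_lift:+.1%})")
```

Keeping the CVR x AOV decomposition in your head is the fastest way to predict which metric will move in a given test.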

When revenue per purchase IS the right metric: If your test hypothesis is specifically about improving the checkout experience for people who are already committed to buying — and you're not expecting CVR to change — then AOV / revenue per purchase is the right lens.

**Pro Tip:** For tests that could affect both CVR and AOV (any product page, homepage, or discovery experience test), always include both revenue per visitor AND revenue per purchase as metrics — the first as primary, the second as secondary. If they diverge, you've found something interesting about how the change affects buyer behavior.

The Variance Problem With Revenue Metrics

Revenue metrics have a statistical problem: they're highly skewed.

In a typical ecommerce scenario:

  • 97% of visitors spend $0
  • 2.5% spend $20-$150
  • 0.5% spend $200-$2,000+

The high-value tail creates enormous variance. Standard deviation on revenue per visitor might be $18 on a mean of $2.40 — a coefficient of variation of 750%. This is a nightmare for statistical testing.

What this means practically: you need substantially larger samples to detect a meaningful lift in revenue per visitor vs conversion rate.

Sample size comparison (10% relative MDE, 95% confidence, 80% power):

  • Conversion rate at 3% baseline → approximately 28,000 visitors per variation
  • Revenue per visitor at $2.40 mean (sigma approximately $18) → approximately 175,000 visitors per variation

Revenue metrics need 6x more traffic than conversion metrics for the same relative effect size. If you don't have the traffic, using revenue per visitor as your primary metric will result in underpowered tests that run too long.
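
As a rough cross-check, here is the classical fixed-horizon sample-size formula, n = 2(z_a + z_b)^2 * sigma^2 / delta^2 per variation. The exact outputs will not match the figures quoted above (Optimizely's sequential Stats Engine uses different math), but it shows mechanically why a high-variance revenue metric demands far more traffic than a binomial conversion metric. All inputs are the illustrative figures from this section:

```python
import math

def n_per_variation(sigma: float, mean: float, rel_mde: float,
                    z_alpha: float = 1.96, z_beta: float = 0.8416) -> int:
    """Fixed-horizon two-sample size: n = 2 * (z_a + z_b)^2 * sigma^2 / delta^2,
    where delta is the absolute minimum detectable effect."""
    delta = rel_mde * mean
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2)

# Conversion metric: binomial, sigma = sqrt(p * (1 - p)) at p = 3%
p = 0.03
n_conversion = n_per_variation(math.sqrt(p * (1 - p)), p, rel_mde=0.10)

# Revenue per visitor: mean $2.40, sigma roughly $18 (zero-inflated heavy tail)
n_revenue = n_per_variation(18.0, 2.40, rel_mde=0.10)

print(n_conversion, n_revenue, round(n_revenue / n_conversion, 1))
```

The multiplier grows further once sequential testing and peeking corrections are layered on, which is why revenue metrics in practice often need several times the traffic of a conversion metric.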

The workaround: use revenue per visitor as your primary metric but cap outlier transactions (e.g., orders over $500 are capped at $500 for analysis purposes). This reduces variance significantly while still capturing the economic story. Discuss this with your analytics team before implementing.

**Pro Tip:** Winsorization (capping outlier values at a percentile threshold) is standard practice in e-commerce experimentation. Capping at the 95th or 99th percentile of order value dramatically reduces variance without materially affecting the analysis for most tests. Optimizely doesn't do this automatically — you'll need to implement it via custom event values if you want winsorized revenue metrics.
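
Since Optimizely won't winsorize for you, here is a minimal sketch of the capping step, run against synthetic lognormal order values (illustrative only, not real data):

```python
import random
import statistics

random.seed(7)

# Synthetic order values: mostly modest, with a heavy right tail
orders = [round(random.lognormvariate(4.0, 0.9), 2) for _ in range(2000)]

def winsorize(values: list[float], pct: float = 0.95) -> list[float]:
    """Cap every value at the pct-th percentile of the observed distribution."""
    cap = sorted(values)[int(pct * (len(values) - 1))]
    return [min(v, cap) for v in values]

capped = winsorize(orders, pct=0.95)

print(f"raw:    mean={statistics.mean(orders):7.2f}  sd={statistics.stdev(orders):7.2f}")
print(f"capped: mean={statistics.mean(capped):7.2f}  sd={statistics.stdev(capped):7.2f}")
```

Note the trade-off: capping pulls the mean down slightly (you undercount whale orders) in exchange for a much tighter standard deviation, which is exactly what shrinks the required sample size.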

Ratio Metric Gotchas

Ratio metrics are powerful but have specific failure modes:

Zero Denominator Problem

If a visitor never triggers the denominator event, they can't contribute to the ratio. This is usually fine — they're simply excluded from the ratio calculation. But it means your ratio metric's effective population is smaller than your total experiment traffic, which affects your sample size calculations.

Example: "Revenue per purchase" only counts visitors who purchased. If your test is about the pre-purchase experience, you're excluding the majority of your traffic from the primary metric.

Outlier Sensitivity

Ratio metrics are more sensitive to outliers than conversion metrics because they include the actual value, not just the binary fact. A single $5,000 order in a small sample can swing the average revenue per purchase dramatically. Apply the same winsorization thinking here.
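
The swing is easy to see with toy numbers (hypothetical orders, deliberately small sample):

```python
orders = [80.0] * 20                      # 20 typical orders at $80 each
aov = sum(orders) / len(orders)

orders_with_outlier = orders + [5000.0]   # one whale order lands in the sample
aov_with_outlier = sum(orders_with_outlier) / len(orders_with_outlier)

print(f"AOV without the outlier: ${aov:.2f}")
print(f"AOV with one $5,000 order: ${aov_with_outlier:.2f}")
```

One order nearly quadruples the measured average, which is why small-sample ratio metrics should never be read without an outlier check.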

Correlation Between Numerator and Denominator

If your test changes both the numerator event rate and the denominator event rate simultaneously, interpreting the ratio becomes complex. Example: a test that causes both more purchases (denominator increases) and higher total revenue (numerator increases), but disproportionately increases low-value purchases. Revenue per purchase could go down even as total revenue goes up.

Always look at numerator and denominator metrics separately alongside the ratio.

**Pro Tip:** When using a ratio metric, add both the numerator metric and denominator metric as secondary metrics. If the ratio moves, you need to know whether it's because the numerator moved, the denominator moved, or both. You can't tell from the ratio alone.
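
A toy decomposition of the scenario above (hypothetical counts): both components rise, yet the ratio falls.

```python
# Hypothetical results: the variant drives more purchases AND more revenue,
# but the extra purchases skew low-value, so revenue per purchase drops.
control = {"revenue": 8000.0, "purchases": 100}
variant = {"revenue": 9100.0, "purchases": 130}

for name, g in (("control", control), ("variant", variant)):
    rpp = g["revenue"] / g["purchases"]   # the ratio metric
    print(f'{name}: revenue=${g["revenue"]:,.0f} purchases={g["purchases"]} '
          f"revenue/purchase=${rpp:.2f}")
```

Revenue is up 13.75% and purchases are up 30%, yet revenue per purchase drops from $80 to $70. Seen alone, the ratio reads as a loss; seen with its components, it reads as a broadening of the buyer base.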

When Total Value Metrics Make Sense

Total value metrics (sum, not average) are useful when you care about aggregate impact rather than per-visitor or per-event averages.

Example: "Total revenue generated in the experiment window"

This sounds appealing but is almost never the right primary metric for A/B tests, because total value is heavily influenced by sample size. If, by chance, the variant group has slightly more visitors, it will show higher total revenue even if there's no per-visitor effect.

The right unit of analysis is always per-visitor or per-session, not total. Total value metrics are useful for post-hoc business impact calculations ("if we ship this variant, how much incremental revenue do we expect per month?") but not for statistical testing.

Practical Setup Examples in Optimizely

Setting Up Revenue Per Visitor

  1. Go to Experiment > Metrics
  2. Click "Add Metric"
  3. Select your revenue event (e.g., purchasecompleted)
  4. Set aggregation to "Revenue" or "Numeric value"
  5. Set numerator to "Total revenue value" and denominator to "Unique visitors"

Note: you need to pass the revenue value when firing the event. In Optimizely's JavaScript API, pass a revenue tag with the value in cents (so $85.00 = 8500).
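
The event itself fires through Optimizely's JavaScript API, but the dollars-to-cents conversion is the error-prone part and is worth doing with exact decimal arithmetic wherever the value originates. A server-side sketch (the helper name is mine, not an Optimizely API):

```python
from decimal import Decimal, ROUND_HALF_UP

def dollars_to_revenue_cents(amount: str) -> int:
    """Convert a dollar string to the integer cents the revenue tag
    expects (e.g. "85.00" -> 8500). Decimal avoids binary-float
    truncation bugs such as int(19.99 * 100) == 1998."""
    cents = (Decimal(amount) * 100).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    return int(cents)

print(dollars_to_revenue_cents("85.00"))   # 8500
print(dollars_to_revenue_cents("19.99"))   # 1999
```

The same logic ports directly to JavaScript; the key point is to round explicitly rather than multiply floats and truncate.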

Setting Up a Ratio Metric

  1. Add Metric > select the numerator event
  2. Set aggregation to "Sum"
  3. Set denominator to a different event (not visitor count)
  4. Example: numerator = totalrevenue (sum), denominator = purchasecount (count of events)

**Pro Tip:** Revenue values in Optimizely are typically passed in cents (integer). $85.00 = 8500. A common bug is passing 85 (the dollar value) instead of 8500 (cents), resulting in revenue metrics 100x lower than reality. Always verify in the Optimizely Results page immediately after your first test purchase fires the event.

Common Mistakes

Using revenue per purchase (ratio) when you should use revenue per visitor (numeric). Revenue per purchase doesn't tell you what happened to the visitors who didn't buy. For most ecommerce tests, revenue per visitor is more informative.

Not accounting for revenue metric variance when estimating test duration. Revenue metrics need 4-6x more traffic than conversion metrics for the same statistical power. Underestimate this and you'll stop tests too early.

Ignoring the zero-denominator issue with ratio metrics. If most visitors don't trigger the denominator event, your ratio metric has effectively much lower statistical power than the total traffic suggests.

Passing revenue values in the wrong units. Dollars vs cents confusion is the most common implementation bug in revenue tracking. Verify event values immediately after launch.

Not adding guardrail metrics alongside revenue metrics. A test that increases revenue per visitor by 5% while increasing refund rate by 15% is not a winner. Always track downstream quality metrics.

What to Do Next

  1. Audit your current experiment metric setup — for each live experiment, identify whether the metric type (conversion, numeric, ratio) actually matches the question you're trying to answer. Rebuild any metrics that are misaligned.
  2. Implement revenue value tracking if you haven't already — make sure your purchase events pass the actual transaction value via the revenue tag in Optimizely's event push. Without this, you can only use conversion metrics for purchase events.
  3. Run a revised sample size calculation for any revenue-metric experiments — use the actual variance of your revenue per visitor (pull it from your analytics tool) to get an accurate estimate of required sample size.
  4. Add both metrics (RPV and revenue per purchase) to your next ecommerce test and watch how they move together. Understanding the relationship between CVR, AOV, and RPV is one of the most valuable analytical habits an experimenter can build.
Written by Atticus Li

Revenue & experimentation leader — behavioral economics, CRO, and AI. CXL & Mindworx certified. $30M+ in verified impact.