E-commerce A/B testing has a dirty secret: most teams optimize for conversion rate when they should be optimizing for revenue per visitor. I’ve seen tests that increased CVR by 15% while destroying average order value — a net negative that took months to discover because nobody was looking at the right metric.

This happens more often than you’d think. A team adds aggressive discount banners, conversion rate jumps, everyone celebrates, and then the CFO notices margins cratered. Or someone simplifies the product page by removing upsell modules — fewer distractions, higher conversion, lower revenue.

The e-commerce funnel is full of these traps. This guide walks through where to test, what to measure, and how to avoid the mistakes that cost real money.

The E-Commerce Funnel and Where to Test

The standard e-commerce funnel looks like this:

Landing/Home Page → Category Page → Product Page → Cart → Checkout → Confirmation

Each stage has different high-impact test areas, and a critical principle applies: the further down the funnel, the higher the conversion impact per visitor. A 10% improvement in checkout completion directly hits revenue. A 10% improvement in homepage engagement might not move revenue at all if it doesn’t carry through the funnel.
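To see why, model the funnel multiplicatively. A minimal sketch with invented stage rates (every number below is an assumption for illustration, not a benchmark):

```python
# Hypothetical funnel stage conversion rates (assumed, not benchmarks).
funnel = {
    "home_to_category": 0.40,
    "category_to_product": 0.50,
    "product_to_cart": 0.10,
    "cart_to_checkout": 0.50,
    "checkout_to_order": 0.60,
}

def orders_per_1000_visitors(rates):
    """Orders are the product of every stage rate times entering traffic."""
    n = 1000.0
    for r in rates.values():
        n *= r
    return n

baseline = orders_per_1000_visitors(funnel)  # 6.0 orders per 1,000 visitors

# A 10% relative lift in checkout completion lifts orders by exactly 10%.
checkout_lift = dict(funnel, checkout_to_order=funnel["checkout_to_order"] * 1.10)
print(orders_per_1000_visitors(checkout_lift) / baseline)  # → 1.10
```

Because orders are the product of all stage rates, a relative lift at any single stage passes through one-to-one; an upper-funnel lift only survives if the downstream rates hold for the extra traffic it sends.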

That said, upper-funnel tests affect more traffic. The art is balancing impact-per-visitor against total visitor volume. I generally recommend a portfolio approach: 60% of tests on product pages and below, 40% on category pages and navigation. Use a prioritization framework (/blog/posts/how-to-prioritize-ab-tests-pxl-framework) to rank specific tests within each category.

Category Page Tests

Category pages are where browsing intent becomes buying intent. The key friction here is helping users find what they want without overwhelming them.

Grid vs. list layout. Grid layouts win for visual products — fashion, home decor, food — where the image drives the purchase decision. List layouts win for specification-heavy products — electronics, industrial supplies, software — where users need to compare details. Don’t assume grid is universally better.

Filter prominence and default sort order. I’ve seen filter redesigns produce 8-12% lifts in product page visits. The default sort order matters more than most people realize — “best selling” often outperforms “newest” or “price low to high” because it surfaces social proof implicitly.

Product card information density. How much information belongs on the card in the grid? Price is obvious. Star ratings consistently lift click-through. Shipping information (“Free shipping” or “Arrives by Friday”) can be powerful but adds visual clutter. Test the density — more information isn’t always better because of cognitive load.

Products per page. More products per page means more scrolling but fewer page loads. The optimal number depends on your product type and audience. I’ve seen cases where reducing from 48 to 24 products per page increased revenue because users engaged more deeply with each product instead of scrolling past everything.

Product Page Tests

The product page is where most revenue is won or lost. This is your highest-ROI testing surface.

Image gallery. The number of images, zoom functionality, 360-degree views, and video all impact conversion. For apparel, more images consistently win — users want to see the product from every angle, on different body types, in context. For commoditized products, one or two clear images may be sufficient. Test adding video — it tends to increase conversion rate and AOV simultaneously because it builds confidence.

Social proof placement. Where you put reviews matters as much as having them. Rating summaries above the fold (the star count and total review number) consistently outperform hiding reviews below the fold. Test pulling the best 2-3 review snippets into a “What customers say” section near the add-to-cart button.

Urgency and scarcity elements. Stock indicators (“Only 3 left”), countdown timers, and recent purchase notifications (“Sarah from Denver bought this 2 hours ago”) can lift conversion — but they also erode trust if overused or perceived as fake. Test carefully and measure return rates alongside conversion. A test that boosts CVR by pushing anxious purchases may increase returns by enough to negate the gain.

Trust badges. Security seals, guarantee badges, and return policy prominence reduce perceived risk. Their impact is largest for lesser-known brands and higher price points. On a well-known brand’s site, trust badges may do nothing. On a DTC startup’s site, they can move conversion meaningfully.

Product description format. Bullets vs. paragraphs, expandable sections vs. full display, technical specs in tables vs. prose. The right format depends on your product complexity. Test expandable sections — they reduce page overwhelm while keeping details accessible for users who want them.

Cart Page Tests

The cart is a leaky bucket for most e-commerce businesses. Users add items but never complete the purchase — industry cart abandonment rates hover around 70%.

Cross-sell and upsell placement. Relevant product recommendations in the cart can lift AOV by 5-15%, but irrelevant recommendations are noise that distracts from checkout. The key word is relevant — “Customers also bought” works when the suggestions genuinely complement the cart contents. Test placement (above vs. below cart items), number of suggestions (3 tends to beat 6), and recommendation algorithm.

Progress indicators. Showing users where they are in the checkout process (Cart → Shipping → Payment → Confirmation) reduces anxiety and abandonment. This is one of the more reliable wins in e-commerce testing.

Shipping cost visibility. Surprise shipping costs at checkout are the number-one cart abandonment reason across every study I’ve seen. Test showing shipping estimates on the cart page or even on the product page. “Free shipping on orders over $50” messaging can simultaneously increase conversion and AOV as users add items to hit the threshold.

Cart persistence. Saved carts and abandoned cart reminders are technically not A/B tests of the page itself, but testing the timing and content of cart reminder emails is high-value. Test sending the first reminder at 1 hour vs. 4 hours vs. 24 hours.

Checkout Tests

Checkout is the narrowest part of the funnel and the highest-stakes testing surface. Small friction here directly costs revenue.

Guest checkout vs. forced account creation. If you’re still forcing account creation before purchase, stop. Guest checkout almost always wins. You can prompt account creation after purchase on the confirmation page when the user is already committed. This is one of the few near-universal truths in e-commerce testing.

Number of form fields. Every form field is friction. Test removing optional fields, combining fields (full name instead of first/last), and using smart defaults. Address autocomplete alone can measurably improve checkout completion rates.

Payment option variety. Buy Now Pay Later options (Klarna, Affirm, Afterpay) can lift conversion 10-30% on higher-priced items. Digital wallets (Apple Pay, Google Pay) reduce friction on mobile. Test adding these options — but measure the total cost including payment processing fees.

Mobile-specific checkout optimization. Mobile is where most checkout abandonment happens. Thumb-friendly tap targets, minimal typing, autofill compatibility, and single-page checkout designs all matter. Test a dedicated mobile checkout flow rather than relying on responsive design to shrink your desktop checkout.

For a deeper look at how to set up these experiments (/blog/posts/how-to-set-up-ab-test-hypothesis-implementation) properly, including hypothesis writing and implementation pitfalls, see the setup guide in this series.

The Metrics Hierarchy: Why RPV Beats CVR

This is the single most important concept in e-commerce testing, and getting it wrong is expensive.

Conversion rate (CVR) is the easiest metric to measure and the most commonly used. But it’s misleading in isolation because it ignores order value entirely.

Average order value (AOV) captures how much each buyer spends. The problem: tests can move CVR and AOV in opposite directions. Add a discount? CVR goes up, AOV goes down. Remove an upsell? CVR goes up (less friction), AOV goes down (no upsell).

Revenue per visitor (RPV = CVR × AOV) is the master metric that captures both effects. A test that increases CVR by 10% but drops AOV by 15% looks like a win on CVR and a loss on AOV — but RPV reveals it’s a net revenue loser. RPV should be your primary metric for the majority of e-commerce tests.
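Here is that arithmetic as a quick sketch, with invented numbers chosen to match the +10% CVR / -15% AOV scenario above:

```python
# Hypothetical test results (all counts and revenue invented for illustration).
control = {"visitors": 50_000, "orders": 1_500, "revenue": 120_000.0}
variant = {"visitors": 50_000, "orders": 1_650, "revenue": 112_200.0}

def metrics(arm):
    cvr = arm["orders"] / arm["visitors"]
    aov = arm["revenue"] / arm["orders"]
    rpv = arm["revenue"] / arm["visitors"]  # equals cvr * aov
    return cvr, aov, rpv

cvr_c, aov_c, rpv_c = metrics(control)
cvr_b, aov_b, rpv_b = metrics(variant)

print(f"CVR lift: {cvr_b / cvr_c - 1:+.1%}")  # +10.0%
print(f"AOV lift: {aov_b / aov_c - 1:+.1%}")  # -15.0%
print(f"RPV lift: {rpv_b / rpv_c - 1:+.1%}")  # -6.5%
```

The multiplication is the whole point: 1.10 × 0.85 = 0.935, so the "winning" variant loses 6.5% of revenue per visitor.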

Lifetime value (LTV) is the ultimate metric, but it’s impractical for most test windows. You can’t wait 12 months to measure an experiment. Use LTV as a guardrail — check whether changes that improve short-term RPV also hold up for repeat purchase behavior over 60-90 days.

For the statistical mechanics of testing ratio metrics like RPV (/blog/posts/ab-testing-statistics-p-values-confidence-intervals), you may need specialized approaches like the delta method. Revenue data is also notoriously high-variance, making CUPED and variance reduction (/blog/posts/cuped-variance-reduction-faster-ab-tests) especially valuable.
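If you are curious what the delta method looks like in practice, here is a generic sketch of the standard error of a ratio of means — for example revenue per session when randomization happens at the user level, so the denominator is itself random. The data below is entirely synthetic, and the distributions are assumptions made for the example:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic per-user data: sessions (denominator) and revenue (numerator).
n = 10_000
sessions = rng.poisson(3, n) + 1
# Toy revenue model: ~3% of users buy; spend scales with session count.
revenue = np.where(rng.random(n) < 0.03, rng.exponential(80.0, n), 0.0) * sessions

def delta_method_ratio_se(num, den):
    """Std. error of mean(num)/mean(den) via the delta method."""
    n = len(num)
    mu_n, mu_d = num.mean(), den.mean()
    var_n, var_d = num.var(ddof=1), den.var(ddof=1)
    cov = np.cov(num, den, ddof=1)[0, 1]
    r = mu_n / mu_d
    # Var(X̄/Ȳ) ≈ (σ_x² − 2rσ_xy + r²σ_y²) / (n μ_y²)
    var = (var_n - 2 * r * cov + r**2 * var_d) / (n * mu_d**2)
    return np.sqrt(var)

rps = revenue.mean() / sessions.mean()  # revenue per session
se = delta_method_ratio_se(revenue, sessions)
print(f"revenue/session = {rps:.3f} ± {1.96 * se:.3f}")
```

When the denominator is constant (e.g., RPV with visitor-level randomization, where every visitor counts exactly once), the covariance and denominator-variance terms vanish and this collapses to the ordinary standard error of a mean.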

Seasonality: The E-Commerce Wild Card

E-commerce has seasonality patterns that can wreck your test results if you’re not careful.

Holiday results do NOT generalize. A test that wins during Black Friday — when purchase intent is at maximum and price sensitivity is different — may not hold in January. Always note the season when archiving results and be skeptical of generalizing holiday findings.

Weekly patterns matter. Weekend shoppers behave differently from weekday shoppers. Some categories see 40% higher conversion on Sundays. Run tests for full weeks (not partial weeks) to avoid day-of-week bias. The article on how long to run tests (/blog/posts/how-long-to-run-ab-test-sample-size) covers this in detail.
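The full-week rule is easy to make mechanical: compute the raw duration from your sample-size requirement, then round up to whole weeks. The traffic and sample-size figures here are invented:

```python
import math

# Hypothetical inputs (assumptions, not benchmarks).
required_per_arm = 42_000        # from a prior sample-size calculation
daily_visitors_per_arm = 4_500

raw_days = required_per_arm / daily_visitors_per_arm  # ≈ 9.3 days
full_weeks = math.ceil(raw_days / 7)                  # always round UP
run_days = full_weeks * 7
print(run_days)  # → 14
```

Rounding up rather than down matters: stopping at 9 or 10 days would both undershoot the sample size and overweight whichever days of the week happened to be included.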

Product launch spikes distort results. If you launch a new product during a test, the traffic and behavior patterns shift in ways that contaminate your results. Avoid overlapping product launches with test periods, or exclude launch-related traffic from your analysis.

Always record the context of every test: dates, any promotions running, major product changes, external events. A test result without context is a test result you can’t learn from.

Mobile vs. Desktop: Two Different Worlds

Mobile represents 60%+ of traffic for most e-commerce sites but converts at roughly half the rate of desktop. This gap is not a “mobile optimization” problem to be solved with responsive design — it reflects fundamentally different browsing behavior.

Test separately. What works on desktop frequently fails on mobile. A complex product comparison tool might lift desktop conversion while hurting mobile where the screen real estate doesn’t support it. Run device-specific analyses for every test, and consider mobile-specific variants when the segmentation data (/blog/posts/ab-testing-segmentation-targeting-heterogeneous-effects) justifies it.

Mobile-specific concerns. Thumb zones determine which page areas get tapped. Tap targets need to be at least 44×44 points (Apple's Human Interface Guidelines; Material Design recommends 48dp). Scroll behavior differs — mobile users scroll more but engage less per scroll. Form input is painful on mobile, so every field you remove has an outsized effect.

Responsive isn’t enough. Many teams assume responsive design handles mobile. It doesn’t. A responsive product page still has the same content hierarchy, the same number of sections, the same information architecture — just reflowed for a smaller screen. Test building genuinely different mobile experiences, not just reflowed desktop ones.

What New Analysts Get Wrong

The biggest mistake in e-commerce testing is testing cosmetic changes when fundamental user objections aren’t addressed. I’ve watched teams run 6 button color tests while their product page had zero reviews and no visible return policy. Prioritize tests that address why users aren’t buying — trust, value clarity, friction — not how the page looks.

The second mistake is celebrating CVR wins without checking AOV. I cannot stress this enough. If your test report doesn’t include RPV, it’s incomplete. I’ve seen teams implement “winning” tests that actually lost the company money because nobody looked at order value.

The third mistake is ignoring the analysis (/blog/posts/how-to-analyze-ab-test-results-segmentation) after the headline result. A test that’s flat overall might be a +15% win on mobile and a -10% loss on desktop. Segment by device, by new vs. returning, by traffic source. The insights in the segments are often more valuable than the topline.
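The scenario in that paragraph looks like this in miniature: pooled numbers near zero while the device segments move in opposite directions. All counts are invented, and the pooled totals are just the sums of the two segments:

```python
# Hypothetical per-segment results: (visitors, orders) per arm.
segments = {
    "mobile":  {"control": (30_000, 600), "variant": (30_000, 690)},  # +15%
    "desktop": {"control": (20_000, 800), "variant": (20_000, 720)},  # -10%
}

def cvr(visitors, orders):
    return orders / visitors

for name, arms in segments.items():
    lift = cvr(*arms["variant"]) / cvr(*arms["control"]) - 1
    print(f"{name}: {lift:+.1%}")

# Pooled across devices, the test looks nearly flat, hiding both effects.
tot_c = cvr(50_000, 1_400)   # 600 + 800 orders
tot_v = cvr(50_000, 1_410)   # 690 + 720 orders
print(f"overall: {tot_v / tot_c - 1:+.1%}")  # → +0.7%
```

A topline readout of "+0.7%, not significant" would lead you to discard a variant worth shipping to mobile and blocking on desktop.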

Pro Tips

Always measure RPV, not just conversion rate. Build this into your test reports as the default primary metric. If your experimentation platform doesn’t calculate RPV natively, compute it yourself — it’s just CVR multiplied by AOV.

Build a test backlog organized by funnel stage. When you tag every test idea with its funnel position, patterns emerge. You might realize you’ve been over-testing the homepage and under-testing checkout. Balance your portfolio.

Screenshot every test variant and archive results with context. Six months from now, someone will ask “did we ever test reviews on the product page?” If you can pull up the exact variant, the result, and the context (season, traffic level, platform mix), you avoid re-running old tests and you build institutional knowledge.

Test pricing presentation, not just page elements. How you display the price (with or without cents, crossed-out original price, per-unit pricing, installment options) can move both CVR and AOV. Pricing presentation tests are some of the highest-ROI experiments in e-commerce. See the dedicated article on A/B testing pricing strategy (/blog/posts/ab-testing-product-pricing-strategy) for more.

When to Stop Testing and Ship

E-commerce teams sometimes fall into perpetual testing mode — testing every minor variation instead of shipping improvements and moving on. Not every decision needs a test. If you’re debating whether to add Apple Pay to your checkout, don’t A/B test it — just add it. The cost of delay exceeds the cost of being wrong on a low-risk, industry-standard change.

Reserve your testing capacity for decisions where the outcome is genuinely uncertain and the stakes are high enough to justify the time investment. The article on A/B testing tradeoffs (/blog/posts/ab-testing-tradeoffs-when-not-to-test) explores this decision framework in more depth.

The goal isn’t to test everything. The goal is to make better revenue decisions faster. Every technique in this guide — from RPV measurement to funnel-stage prioritization — serves that single objective.

Written by Atticus Li

Revenue & experimentation leader — behavioral economics, CRO, and AI. CXL & Mindworx certified. $30M+ in verified impact.