If your team runs experimentation, you already know the ugly part: the results meeting turns into a debate about which metric “matters.” Someone points at conversion. Someone else points at retention. Finance wants revenue. Product wants engagement.

When you don’t have a single North Star Metric, every A/B testing process becomes politics. You ship noisy wins, miss real wins, and waste cycles arguing.

I’m going to show you how I pick one North Star Metric for an experimentation program to drive revenue growth. Not a poster metric. A primary metric for your growth model that improves decision making under uncertainty.

What a north star metric must do (or your experiments won’t compound)


Flowchart to identify North Star Metric that stays tied to cash outcomes, created with AI.

A north star metric is not “the most important number in the company.” In an experimentation context, it’s the primary metric you agree to optimize when tradeoffs show up.

Here’s what I require before I let a metric become the north star:

First, it has to connect to lagging indicators like revenue growth or retention with a straight face. I don’t need perfect attribution, but I need a believable chain: metric up, cash up (now or later). If you can’t explain that chain in 60 seconds, the metric is a distraction.

Second, it must represent a user value moment. This is where behavioral science earns its keep. People don’t buy because your funnel is pretty. They buy because they felt customer value, reduced effort, or avoided loss. Your north star should track the user behavior that happens right after value is delivered (not the behavior that happens when someone is merely curious).

Third, it has to move fast enough as a leading indicator to be useful for experimentation. If your metric needs 90 days to show signal, your program will drift into vibes. For startup growth, speed matters because runway is short and learning needs to be tight.

Fourth, it must be hard to game, and it should be paired with guardrail metrics. If a team can inflate the metric without improving the product, they will. Not because they’re bad people, but because incentives work. A metric that’s easy to game turns your growth strategy into theater.

If you want a solid baseline definition and examples, I generally align with Amplitude’s guide to finding a North Star Metric, then I tighten it for experiments.

My rule: if the metric doesn’t change when the user gets more value, it’s not your north star.

This is also where product-led growth either becomes real or stays a slide. In PLG, the product is the sales motion, so the north star serves as the fundamental unit of value: it sits close to “user got value,” not “we got traffic,” and maps cleanly onto your acquisition-retention-monetization framework.

How I pick the metric in practice: start at cash, then walk backward to behavior

I start with the P&L, then I move backward to the product.

Why? Because experiments are expensive. Even “simple” tests eat design, engineering, QA, analysis, and opportunity cost. If your north star doesn’t line up with how you make money and with your business goals, your experimentation roadmap will feel busy and still miss the quarter. The key is to find the right unit of value.

Here’s the selection process I use:

  1. I write down the cash outcome I care about most in the next 6 to 12 months (new revenue, expansion, churn reduction).
  2. I name the user value moment that has a causal connection to that cash outcome.
  3. I list 3 to 5 candidate metrics that reflect that moment.
  4. I pick the one that best balances speed, integrity, and cash alignment.
  5. I keep the others as secondary metrics or guardrails, not co-equal goals.
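The steps above can be sketched as a simple pressure-test. The candidate metrics and their 1-to-5 scores below are invented for illustration; the one rule I encode is that a fatal weakness on any criterion disqualifies a candidate outright, no matter how strong the total is.

```python
# Hypothetical pressure-test: score candidates on the four requirements
# (cash link, value moment, speed, game resistance), 1-5 each.
candidates = {
    "raw_signups":               (2, 1, 5, 1),  # fast but gameable, weak value moment
    "activated_accounts_per_wk": (4, 5, 4, 4),  # balanced across all four
    "net_revenue_retention":     (5, 4, 1, 5),  # strong but far too slow for tests
}

def score(metric_scores):
    """Total score, with a hard floor: any score of 1 disqualifies."""
    if min(metric_scores) <= 1:
        return 0
    return sum(metric_scores)

ranked = sorted(candidates, key=lambda m: score(candidates[m]), reverse=True)
north_star = ranked[0]
print(north_star)   # the winner; the rest become guardrails or secondaries
```

The hard floor matters more than the totals: a metric that is trivially gameable or takes a quarter to move loses even if it looks great on the other criteria.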

Before I commit, I pressure-test each candidate against the four requirements above: cash link, value moment, speed, and game resistance.

A concrete example from B2B SaaS: I’ll often choose activated accounts per week as the north star for growth efficiency, where “activated” is strict (for example, created first project, invited 1 teammate, hit a success event). Then I model the financial impact with customer lifetime value in mind:

  • If activated-to-paid is 18%
  • Average first-year gross margin is $1,800
  • Then each additional activated account is worth about $324 in expected gross margin (0.18 × 1,800)

Now your A/B testing program has a scoreboard that finance understands. More importantly, your team can compare experiments that move different parts of the funnel by converting them into the same unit of value.

This is where analytics matters. If you can’t measure activation cleanly, don’t pretend. Fix instrumentation first, or your north star becomes a random number generator.

Applied AI can help here, but I keep it in its place. I’ll use a simple model to identify which early behaviors predict retention or expansion. Still, I don’t make “model score” the north star. I use it to validate that my chosen metric is pointed at future cash, not just today’s clicks.
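A lightweight version of that validation step doesn’t even need a model: compare retention rates with and without each early behavior. The user records below are fabricated for illustration; in practice you would pull real cohort data from your analytics store.

```python
# Fabricated cohort: which early behavior predicts 90-day retention?
users = [
    {"invited_teammate": True,  "opened_app_daily": True,  "retained_90d": True},
    {"invited_teammate": True,  "opened_app_daily": False, "retained_90d": True},
    {"invited_teammate": False, "opened_app_daily": True,  "retained_90d": False},
    {"invited_teammate": False, "opened_app_daily": True,  "retained_90d": True},
    {"invited_teammate": False, "opened_app_daily": False, "retained_90d": False},
]

def retention_lift(behavior: str) -> float:
    """Retention rate among users with the behavior minus the rate without."""
    with_b    = [u["retained_90d"] for u in users if u[behavior]]
    without_b = [u["retained_90d"] for u in users if not u[behavior]]
    rate = lambda xs: sum(xs) / len(xs) if xs else 0.0
    return rate(with_b) - rate(without_b)

for behavior in ("invited_teammate", "opened_app_daily"):
    print(behavior, round(retention_lift(behavior), 2))
```

If the behavior your north star tracks shows a much larger lift than a raw engagement signal, that is evidence the metric is pointed at future cash rather than today’s clicks.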

For teams building a real experimentation culture, I also like Speero’s take on why programs exist in the first place, which is to learn under uncertainty and scale wins, not to celebrate tests: why experimentation drives business growth.

The tradeoffs that break north star metrics (and how I avoid the expensive mistakes)


Examples of north star metrics by business model, created with AI.

Most north star metric failures look like “we picked something reasonable,” then six weeks later the experiment backlog is a mess of secondary metrics.

These are the failure modes I see most:

Vanity metrics sneak in. Pageviews, raw signups, app opens. These micro-conversions move fast, so they feel good, but they rarely hold up when you tie them to the macro-conversions that drive margin. If the metric makes the team cheer but doesn’t change cash, kill it.

The metric is too slow. Retention and revenue are ultimate outcomes, but they can be painful as the primary north star for experimentation. If you’re early and moving fast, pick a leading indicator that you’ve proven predicts retention, then guardrail cohort retention so you don’t burn the future.

One metric can’t cover two products. If you have a marketplace plus a SaaS tool, forcing a single number across both will produce bad local decisions. In that case, I still pick one company north star, but I run experiments against a domain-level north star for each product and map both back to the company number.

Teams optimize around the metric, not the user. This is behavioral economics in the real world. People respond to incentives. If “activated” can be faked by spammy invites or empty projects, it will be. Fix it by tightening the definition, adding a quality threshold, or pairing it with a guardrail like downstream conversion.

The metric doesn’t match the constraint. Sometimes the constraint is sales capacity, onboarding support, or inventory. If your bottleneck is not demand, then pushing top-of-funnel conversion can raise costs without raising revenue.

When should you ignore all of this? If you’re pre-product-market fit and still searching for who the user is, don’t overcommit to a north star. Pick a temporary learning metric (like “users who reach the aha moment”) and revisit every month. Also, if you’re in a regulated workflow where cycles are long, you may need a slower north star and a different experimentation cadence.

Conclusion: commit to one metric, then make it earn its place

A North Star Metric serves as your primary metric and commitment device. It reduces noise, speeds up decision making, and makes your experimentation program comparable across teams.

My concrete next step: pick 3 candidates that align with your business goals and your acquisition, retention, monetization strategy; run them through (1) cash link, (2) value moment, (3) speed, (4) game resistance; then choose one north star metric for the next 90 days. Write it down, define it tightly, and review it every month with one question: did optimizing it improve conversion rate and revenue growth, or just produce prettier charts?

Atticus Li

Experimentation and growth leader. Builds AI-powered tools, runs conversion programs, and writes about economics, behavioral science, and shipping faster.