Cohort Analysis
A method of analysis that groups users by a shared characteristic or experience within a defined time period, then tracks their behavior over time to reveal retention patterns, lifecycle trends, and long-term treatment effects.
What Is Cohort Analysis?
Cohort analysis groups users by when they joined (acquisition cohorts) or what they did (behavioral cohorts) and tracks each cohort's behavior over time. Instead of collapsing all users into a single pool, cohort analysis reveals longitudinal patterns — retention curves, revenue maturation, feature-adoption rates — that aggregate metrics hide.
Also Known As
- Marketing team: "cohort retention," "vintage analysis"
- Sales team: "deal cohorts," "customer vintage"
- Growth team: "retention cohorts," "signup cohorts"
- Data team: "cohort tables," "longitudinal analysis"
- Finance team: "revenue cohort curves," "customer vintage reporting"
- Product team: "activation cohorts," "behavioral cohorts"
How It Works
You build a retention matrix: rows are signup months (Jan 2025, Feb 2025, etc.), columns are months since signup (month 0 is the signup month, followed by month 1, month 2, etc.), and each cell shows the percentage of the cohort still active. Jan 2025 cohort: 100% → 55% → 42% → 35% → 30%. April 2025 cohort: 100% → 48% → 33% → 25% → 20%. Aggregate retention can look flat because a growing user base masks the fact that each new cohort retains worse than the last, a silent decline that only cohort analysis reveals.
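As a minimal sketch of how such a matrix could be computed, the Python below assumes two hypothetical inputs that the text does not specify: a mapping of user ID to signup month, and a list of (user ID, active month) records. The function names and sample data are illustrative only.

```python
from collections import defaultdict

def month_index(ym):
    """Convert 'YYYY-MM' to a running month count so months can be subtracted."""
    year, month = ym.split("-")
    return int(year) * 12 + int(month) - 1

def retention_matrix(signups, activity):
    """signups: {user_id: 'YYYY-MM'}; activity: iterable of (user_id, 'YYYY-MM').
    Returns {cohort_month: [% active at month 0, month 1, ...]}."""
    cohort_users = defaultdict(set)
    for user, month in signups.items():
        cohort_users[month].add(user)

    active = defaultdict(set)  # (cohort, months since signup) -> active users
    for user, month in activity:
        cohort = signups.get(user)
        if cohort is None:
            continue  # activity from a user with no known signup month
        age = month_index(month) - month_index(cohort)
        if age >= 0:
            active[(cohort, age)].add(user)

    matrix = {}
    for cohort, users in sorted(cohort_users.items()):
        ages = [a for (c, a) in active if c == cohort]
        max_age = max(ages, default=0)
        matrix[cohort] = [
            round(100 * len(active.get((cohort, age), set())) / len(users), 1)
            for age in range(max_age + 1)
        ]
    return matrix

# Hypothetical sample data: four January signups, two February signups.
signups = {"u1": "2025-01", "u2": "2025-01", "u3": "2025-01",
           "u4": "2025-01", "u5": "2025-02", "u6": "2025-02"}
activity = [
    ("u1", "2025-01"), ("u1", "2025-02"), ("u1", "2025-03"),
    ("u2", "2025-01"), ("u2", "2025-02"),
    ("u3", "2025-01"),
    ("u4", "2025-01"), ("u4", "2025-03"),
    ("u5", "2025-02"), ("u5", "2025-03"),
    ("u6", "2025-02"),
]

matrix = retention_matrix(signups, activity)
for cohort, row in matrix.items():
    print(cohort, row)
```

Each row is one cohort and each position in the row is that cohort's age in months. Younger cohorts therefore have shorter rows, which gives the matrix its usual triangular shape.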
Best Practices
- Always disaggregate retention into cohorts; aggregate retention is a misleading average.
- Build both acquisition cohorts (when joined) and behavioral cohorts (completed onboarding, adopted feature X).
- Track long-term cohort metrics (90-day, 180-day, 365-day) in every A/B test, not just primary conversion.
- Compare cohorts across channels to identify acquisition quality differences invisible in CAC.
- Visualize with retention curves, not just matrices — shape matters.
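The same grouping logic applies to acquisition channels and behavioral cohorts. The sketch below assumes a hypothetical user record with `channel`, `completed_onboarding`, `age_days` (days since signup), and `last_active_day` (days since signup of the most recent activity). It also assumes a user counts as retained at a horizon if they were active on or after that day. These field names and that retention definition are illustrative choices, not a standard.

```python
from collections import defaultdict

def retention_at(users, key, horizon_days):
    """Share of users in each segment (grouped by `key`) retained at the horizon.
    Users younger than the horizon are skipped: they cannot be observed yet."""
    totals = defaultdict(int)
    retained = defaultdict(int)
    for u in users:
        if u["age_days"] < horizon_days:
            continue
        totals[u[key]] += 1
        if u["last_active_day"] >= horizon_days:
            retained[u[key]] += 1
    return {segment: retained[segment] / totals[segment] for segment in totals}

# Hypothetical sample data.
users = [
    {"channel": "paid_social", "completed_onboarding": False, "age_days": 200, "last_active_day": 10},
    {"channel": "paid_social", "completed_onboarding": True,  "age_days": 200, "last_active_day": 150},
    {"channel": "paid_social", "completed_onboarding": False, "age_days": 120, "last_active_day": 30},
    {"channel": "paid_social", "completed_onboarding": False, "age_days": 100, "last_active_day": 95},
    {"channel": "organic",     "completed_onboarding": True,  "age_days": 180, "last_active_day": 170},
    {"channel": "organic",     "completed_onboarding": True,  "age_days": 150, "last_active_day": 100},
    {"channel": "organic",     "completed_onboarding": False, "age_days": 95,  "last_active_day": 20},
    {"channel": "organic",     "completed_onboarding": True,  "age_days": 40,  "last_active_day": 40},
]

by_channel = retention_at(users, "channel", horizon_days=90)
by_onboarding = retention_at(users, "completed_onboarding", horizon_days=90)
print(by_channel)
print(by_onboarding)
```

Calling the same function with a different `key` switches between acquisition cohorts and behavioral cohorts. Note that a gap between behavioral cohorts is a correlation: users who complete onboarding may differ from those who do not in ways the grouping does not capture.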
Common Mistakes
- Reporting aggregate retention numbers that hide declining cohort quality.
- Comparing cohorts of vastly different ages (8-month-old vs. 2-month-old) without aligning them to the same age, for example comparing both at month 2 rather than at their latest values.
- Ignoring compositional shifts (Black Friday cohorts behave differently than January cohorts).
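To illustrate the age-mismatch mistake, the sketch below reuses the retention figures from the How It Works example, but assumes (hypothetically) that the April 2025 cohort has only been observed through month 2. The function name is illustrative.

```python
# Retention (%) by months since signup, taken from the How It Works example.
jan_2025 = [100, 55, 42, 35, 30]  # observed for months 0-4
apr_2025 = [100, 48, 33]          # assumption: only months 0-2 observed so far

def compare_at_matched_age(older, younger):
    """Compare two retention curves only over the ages both have reached.
    Returns (age, older %, younger %, difference in points) per shared age."""
    common = min(len(older), len(younger))
    return [
        (age, older[age], younger[age], younger[age] - older[age])
        for age in range(common)
    ]

rows = compare_at_matched_age(jan_2025, apr_2025)
for age, old, new, diff in rows:
    print(f"month {age}: Jan {old}% vs Apr {new}% ({diff:+d} pts)")

# Comparing the latest value of each curve instead would mislead:
naive_gap = apr_2025[-1] - jan_2025[-1]
print("naive latest-value gap:", naive_gap)
```

The naive comparison of latest values suggests April is ahead by 3 points, while the matched comparison at month 2 shows it is 9 points behind.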
Industry Context
SaaS and B2B rely on cohort analysis for net dollar retention, logo retention, and expansion revenue by cohort. Ecommerce and DTC use cohort analysis for repeat purchase rates and LTV curves. Lead gen operations apply cohort analysis to MQL-to-close progression, tracking how different lead sources mature over time.
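As a rough sketch of the SaaS metrics named above, the code below uses a common definition: net dollar retention is the current recurring revenue from a cohort's original customers divided by that cohort's starting recurring revenue, and logo retention is the share of those customers still paying. The customer IDs and revenue figures are hypothetical, and real definitions vary in how they treat details such as reactivations.

```python
def net_dollar_retention(start_mrr, current_mrr):
    """start_mrr / current_mrr: {customer_id: monthly recurring revenue}.
    Only customers present at the start count; churned customers contribute 0."""
    starting = sum(start_mrr.values())
    current = sum(current_mrr.get(customer, 0) for customer in start_mrr)
    return current / starting

def logo_retention(start_mrr, current_mrr):
    """Share of the cohort's starting customers that still pay something."""
    still_paying = sum(1 for customer in start_mrr if current_mrr.get(customer, 0) > 0)
    return still_paying / len(start_mrr)

# Hypothetical cohort: customer "c" churned, customer "a" expanded.
cohort_start = {"a": 100, "b": 200, "c": 100}
cohort_now = {"a": 250, "b": 200}

ndr = net_dollar_retention(cohort_start, cohort_now)
logos = logo_retention(cohort_start, cohort_now)
print(f"NDR: {ndr:.1%}, logo retention: {logos:.1%}")
```

In this example net dollar retention is above 100% even though a third of the cohort's customers churned, which is why the two metrics are usually read together.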
The Behavioral Science Connection
Aggregate retention numbers invite a form of the ecological fallacy (aggregation bias): treating a pooled statistic as a meaningful signal about any specific cohort. Cohort analysis is the antidote: it forces you to see the underlying distribution. It also counters a survivorship effect, where your oldest, stickiest users dominate aggregate metrics while newer cohorts quietly churn.
Key Takeaway
If you only look at aggregate retention, you will miss declining cohort quality until it's too late. Cohort analysis is essential for any subscription or repeat-purchase business.