What Are Bootstrap Confidence Intervals?

Atticus Li

Bootstrap Confidence Intervals

A resampling method that estimates the sampling distribution of a statistic by repeatedly drawing with replacement from the observed data.

The bootstrap generates confidence intervals without assuming a parametric form for the sampling distribution. Instead of invoking the CLT and using ±1.96 × SE, you resample your actual data with replacement thousands of times, compute the statistic each time, and read off the 2.5th and 97.5th percentiles of the results. The same recipe applies to almost any statistic (median, ratio, winsorized mean, an arbitrary percentile), although it is unreliable for extremes such as the sample maximum.

Also Known As

Data science: bootstrap, resampling CI, percentile bootstrap, BCa bootstrap
Growth: "empirical confidence interval"
Marketing: nonparametric CI
Engineering: resampled uncertainty estimate

How It Works

You have 18,000 users in each arm with revenue data (heavy right tail). You want a CI on the ratio of mean revenue (variant / control). An analytic CI for a ratio is messy: it needs the delta method or Fieller's theorem. Instead: resample 18,000 users with replacement from each arm, compute the ratio, and repeat 10,000 times. The 2.5th and 97.5th percentiles of those 10,000 ratios give the 95% bootstrap CI, say [1.02, 1.11]. Done.
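The procedure above can be sketched in a few lines. This is a minimal illustration, not a production implementation: it uses only the Python standard library to stay self-contained, the revenue data are synthetic log-normal draws, the arms are shrunk to 1,000 users and 2,000 resamples so it runs quickly, and the names `percentile` and `bootstrap_ratio_ci` are hypothetical helpers introduced here.

```python
import random
from statistics import fmean


def percentile(sorted_values, q):
    """Linear-interpolation percentile of an already sorted list, q in [0, 100]."""
    pos = (len(sorted_values) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)


def bootstrap_ratio_ci(control, variant, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for mean(variant) / mean(control)."""
    rng = random.Random(seed)  # seeded so the interval is reproducible
    ratios = []
    for _ in range(n_resamples):
        # Resample each arm independently, with replacement, at its original size.
        c = rng.choices(control, k=len(control))
        v = rng.choices(variant, k=len(variant))
        ratios.append(fmean(v) / fmean(c))
    ratios.sort()
    return (percentile(ratios, 100 * alpha / 2),
            percentile(ratios, 100 * (1 - alpha / 2)))


# Synthetic, hypothetical data: log-normal revenue with a heavy right tail.
data_rng = random.Random(42)
control = [data_rng.lognormvariate(3.0, 1.0) for _ in range(1000)]
variant = [data_rng.lognormvariate(3.05, 1.0) for _ in range(1000)]

point = fmean(variant) / fmean(control)
low, high = bootstrap_ratio_ci(control, variant)
print(f"ratio = {point:.3f}, 95% CI = [{low:.3f}, {high:.3f}]")
```

With real data at the scale described above, a vectorized library such as NumPy would likely be a better fit than pure-Python loops.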

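BCa (bias-corrected and accelerated) bootstrap shifts the percentiles used for the endpoints to correct for bias and skewness in the bootstrap distribution. It improves coverage for skewed data and is a common default in modern practice; SciPy's scipy.stats.bootstrap, for example, uses it unless told otherwise.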
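To make the BCa adjustment concrete, here is a rough sketch for a one-sample statistic, assuming the textbook formulation: a bias correction z0 taken from the share of replicates below the point estimate, and an acceleration a estimated from leave-one-out (jackknife) values. The function name `bca_ci` and the synthetic sample are illustrative assumptions; in practice a library routine would normally be preferable to hand-rolled code.

```python
import random
from statistics import NormalDist, fmean


def bca_ci(data, stat, n_resamples=2000, alpha=0.05, seed=0):
    """BCa bootstrap CI for a one-sample statistic `stat(list) -> float`."""
    rng = random.Random(seed)
    n = len(data)
    theta_hat = stat(data)
    reps = sorted(stat(rng.choices(data, k=n)) for _ in range(n_resamples))

    norm = NormalDist()
    # Bias correction: share of replicates below the point estimate, as a z-score.
    below = sum(r < theta_hat for r in reps)
    # Clamp so inv_cdf never receives exactly 0 or 1.
    frac = min(max(below / n_resamples, 1 / (n_resamples + 1)),
               n_resamples / (n_resamples + 1))
    z0 = norm.inv_cdf(frac)

    # Acceleration from leave-one-out (jackknife) estimates.
    jack = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    jack_mean = fmean(jack)
    num = sum((jack_mean - j) ** 3 for j in jack)
    den = 6 * sum((jack_mean - j) ** 2 for j in jack) ** 1.5
    a = num / den if den else 0.0

    def adjusted_quantile(p):
        # Map the nominal quantile p to the BCa-adjusted quantile.
        z = norm.inv_cdf(p)
        return norm.cdf(z0 + (z0 + z) / (1 - a * (z0 + z)))

    def pick(p):
        # Nearest-rank lookup into the sorted replicates.
        idx = min(max(round(p * (n_resamples - 1)), 0), n_resamples - 1)
        return reps[idx]

    return pick(adjusted_quantile(alpha / 2)), pick(adjusted_quantile(1 - alpha / 2))


# Synthetic, hypothetical right-skewed sample.
data_rng = random.Random(1)
sample = [data_rng.lognormvariate(0.0, 1.0) for _ in range(200)]
bca_low, bca_high = bca_ci(sample, fmean)
print(f"mean = {fmean(sample):.3f}, 95% BCa CI = [{bca_low:.3f}, {bca_high:.3f}]")
```

When z0 and a are both zero, the adjusted quantiles reduce to alpha/2 and 1 - alpha/2, so the result matches the plain percentile interval.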

Best Practices

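Use 10,000+ resamples when you report interval endpoints, which sit in the tails; around 1,000 is usually enough for a standard error or a quick look at the center of the distribution.
Use BCa bootstrap when data is skewed. The plain percentile bootstrap tends to undercover in skewed cases, meaning a nominal 95% interval contains the true value less than 95% of the time.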
Bootstrap the right unit — in experiments with repeated observations per user, cluster-bootstrap at the user level.
Seed your resampler so results are reproducible.
Compare bootstrap CI to analytic CI — large disagreements often expose a bug or heavy skew.
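The "bootstrap the right unit" advice can be illustrated by running both versions on the same data. The sketch below is an assumption-laden toy: the rows are synthetic (300 users, each with a personal spend level and 1 to 10 orders), and `row_bootstrap`, `user_bootstrap`, and `percentile_ci` are hypothetical helper names.

```python
import random
from collections import defaultdict


def percentile_ci(values, alpha=0.05):
    """Nearest-rank percentile interval from a list of bootstrap replicates."""
    ordered = sorted(values)
    last = len(ordered) - 1
    return ordered[round(alpha / 2 * last)], ordered[round((1 - alpha / 2) * last)]


def row_bootstrap(rows, n_resamples, rng):
    """Naive version: resample rows as if they were independent."""
    values = [v for _, v in rows]
    return [sum(rng.choices(values, k=len(values))) / len(values)
            for _ in range(n_resamples)]


def user_bootstrap(rows, n_resamples, rng):
    """Cluster version: resample whole users, keeping all of their rows together."""
    by_user = defaultdict(list)
    for user_id, value in rows:
        by_user[user_id].append(value)
    # Pre-aggregate each user to (sum, row count) so each resample is cheap.
    clusters = [(sum(v), len(v)) for v in by_user.values()]
    reps = []
    for _ in range(n_resamples):
        draw = rng.choices(clusters, k=len(clusters))
        reps.append(sum(s for s, _ in draw) / sum(c for _, c in draw))
    return reps


# Synthetic, hypothetical data: each user has a spend level and 1-10 orders.
data_rng = random.Random(7)
rows = []
for user_id in range(300):
    level = data_rng.lognormvariate(3.0, 0.8)
    for _ in range(data_rng.randint(1, 10)):
        rows.append((user_id, level * data_rng.uniform(0.8, 1.2)))

row_ci = percentile_ci(row_bootstrap(rows, 1000, random.Random(0)))
user_ci = percentile_ci(user_bootstrap(rows, 1000, random.Random(0)))
print("row-level :", row_ci)
print("user-level:", user_ci)
```

Because orders from the same user are strongly correlated in this toy data, the user-level interval should come out visibly wider; the row-level interval looks more precise only because it treats correlated rows as independent.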

Common Mistakes

Bootstrapping observations when users contribute multiple rows. This understates uncertainty; bootstrap users, not rows.
Under-resampling. CI endpoints have their own noise; too few reps leaves wobbly bounds.
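Using bootstrap on tiny samples. Below roughly n=30 the bootstrap tends to underestimate uncertainty, because resamples can only reuse the few values that were observed; normal-theory intervals such as the t-interval sometimes do better.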
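The under-resampling point is easy to check empirically: hold the data fixed, change only the random seed, and watch how much an endpoint moves. The following is a small sketch under assumed settings (a synthetic skewed sample of 200 values, 15 seeds, 100 versus 1,600 resamples); `upper_endpoint` is a hypothetical helper.

```python
import random
from statistics import fmean, pstdev


def upper_endpoint(data, n_resamples, seed):
    """97.5th percentile (nearest rank) of bootstrap means for one seed."""
    rng = random.Random(seed)
    reps = sorted(fmean(rng.choices(data, k=len(data))) for _ in range(n_resamples))
    return reps[round(0.975 * (n_resamples - 1))]


# Synthetic, hypothetical skewed sample; the data stay fixed, only the seed changes.
data_rng = random.Random(3)
data = [data_rng.lognormvariate(0.0, 1.0) for _ in range(200)]

spread = {}
for n_resamples in (100, 1600):
    endpoints = [upper_endpoint(data, n_resamples, seed) for seed in range(15)]
    spread[n_resamples] = pstdev(endpoints)
    print(f"B={n_resamples:>5}: SD of upper endpoint across seeds = "
          f"{spread[n_resamples]:.4f}")
```

The spread across seeds is pure Monte Carlo noise, since the underlying sample never changes. It should shrink roughly with the square root of the number of resamples.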

Industry Context

In SaaS/B2B, bootstrap handles ratio metrics (LTV/CAC, ARPU, retention) cleanly where analytic formulas are awkward. In ecommerce, bootstrap CIs for revenue, AOV, and complex funnel conversion rates are the workhorse. In lead gen, bootstrap is ideal for multi-stage funnel conversion where each stage compounds uncertainty non-trivially.

The Behavioral Science Connection

Bootstrap replaces dubious normality assumptions with data-driven empirics. It forces analysts to be humble about what they know — the data speaks, not a textbook formula. In stakeholder communication, "we resampled your data 10,000 times" is also more intuitive than "we invoked the central limit theorem."

Key Takeaway

When your statistic is anything more complex than a simple mean, or your data is skewed, bootstrap confidence intervals are the right default. They cost more computation but spare you the analytic derivation, as long as you resample the right unit and use enough resamples.
