
Normal Distribution

A symmetric, bell-shaped probability distribution defined by its mean and standard deviation, where approximately 68% of values fall within one standard deviation of the mean and 95% within two.

What Is the Normal Distribution?

The normal distribution is a symmetric bell-shaped curve fully described by two numbers: a mean (the center) and a standard deviation (the spread). It shows up everywhere in nature and statistics, from human heights to measurement error, and it is the mathematical backbone of most A/B testing significance calculations.
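The 68-95-99.7 percentages quoted above can be verified directly from the normal CDF, which the Python standard library exposes via the error function. A minimal sketch (the `normal_cdf` helper is ours, not a library function):

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of the normal distribution, expressed via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

# Probability of landing within k standard deviations of the mean
for k in (1, 2, 3):
    p = normal_cdf(k) - normal_cdf(-k)
    print(f"within {k} sd: {p:.4f}")
# -> 0.6827, 0.9545, 0.9973: the 68-95-99.7 rule
```

Note that only the mean and standard deviation appear anywhere in the formula: those two numbers really are the complete description of the distribution.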

Also Known As

  • Data science teams: Gaussian distribution, N(mu, sigma^2)
  • Growth teams: bell curve
  • Marketing teams: the normal curve
  • Engineering teams: Gaussian, standard normal (when mean=0, sd=1)

How It Works

Imagine plotting the conversion rates from 1,000 hypothetical reruns of an A/B test with 10,000 visitors per variant. The histogram would form a bell curve centered at the true conversion rate. If the standard deviation of that curve is 0.2%, then roughly 680 of those 1,000 reruns would land within 0.2% of the true value, 950 within 0.4%, and 997 within 0.6%. That 68-95-99.7 rule is the reason a 95% confidence interval uses roughly plus-or-minus two standard errors.
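The thought experiment above can be run as an actual simulation. This sketch assumes a hypothetical true conversion rate of 5% (the section does not specify one), which gives a standard error close to the 0.2% in the example:

```python
import random

random.seed(42)

TRUE_RATE = 0.05   # hypothetical true conversion rate (assumption)
VISITORS = 10_000  # visitors per variant, as in the example
RERUNS = 1_000     # hypothetical reruns of the same test

# Observed conversion rate from each simulated rerun
rates = []
for _ in range(RERUNS):
    conversions = sum(random.random() < TRUE_RATE for _ in range(VISITORS))
    rates.append(conversions / VISITORS)

# Standard error of a proportion: sqrt(p * (1 - p) / n), about 0.22% here
se = (TRUE_RATE * (1 - TRUE_RATE) / VISITORS) ** 0.5

within_1 = sum(abs(r - TRUE_RATE) <= se for r in rates)
within_2 = sum(abs(r - TRUE_RATE) <= 2 * se for r in rates)
print(within_1, within_2)  # roughly 680 and 950 of the 1,000 reruns
```

Each rerun draws its own random visitors, yet the counts inside one and two standard errors track the 68-95-99.7 rule closely, which is exactly why a 95% confidence interval spans about two standard errors on either side of the estimate.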

Best Practices

  • Do use normal-based tests when sample sizes are large and the sampling distribution of the mean is approximately symmetric.
  • Do check histograms or Q-Q plots before trusting normality for small samples.
  • Do use log transformations for skewed revenue data before applying normal-based methods.
  • Do not assume your raw data is normal just because the sampling distribution of the mean is.
  • Do not apply normal approximations to proportions below 5% without verifying sample-size adequacy.
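The last practice can be turned into a guard clause. A common rule of thumb (one convention among several; the threshold of 10 expected events is an assumption, some texts use 5) requires enough expected successes and failures before trusting the normal approximation to a binomial proportion:

```python
def normal_approx_ok(n, p, min_events=10):
    """Rule-of-thumb check before using a normal approximation for a
    proportion: require at least `min_events` expected successes AND
    failures (the common np >= 10 criterion; threshold is a convention)."""
    return n * p >= min_events and n * (1 - p) >= min_events

print(normal_approx_ok(10_000, 0.05))  # True: 500 expected conversions
print(normal_approx_ok(500, 0.01))    # False: only 5 expected conversions
```

When the check fails, an exact binomial test or more data is the safer path than forcing a z-test.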

Common Mistakes

  • Treating heavy-tailed revenue data as normal and underestimating variance.
  • Using z-scores on ordinal or ranked data where normal assumptions fail.
  • Ignoring that normality is about the estimator, not the raw metric.
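The last mistake is worth seeing in numbers. This sketch uses an exponential distribution as a stand-in for skewed revenue-per-user data (an assumption for illustration): the raw metric is badly skewed, yet the sampling distribution of its mean is nearly symmetric, which is the central limit theorem at work:

```python
import random
import statistics

random.seed(0)

# Heavily right-skewed "revenue per user": exponential with mean 20 (assumption)
revenue = [random.expovariate(1 / 20) for _ in range(100_000)]

# Raw metric: the mean sits well above the median, a signature of skew
print(statistics.mean(revenue), statistics.median(revenue))

# Sampling distribution of the mean (n=500 per sample): mean and median
# nearly coincide, i.e. the estimator is approximately normal
sample_means = [
    statistics.mean(random.choices(revenue, k=500)) for _ in range(2_000)
]
print(statistics.mean(sample_means), statistics.median(sample_means))
```

The raw data would fail any normality check, but the estimator built from it would pass: normal-based inference on the mean can be valid even when the metric itself is not remotely normal.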

Industry Context

  • SaaS/B2B: Low conversion rates force extra care; normal approximations to binomials can fail below 50 events.
  • Ecommerce/DTC: Conversion rate metrics tend to be well-approximated by normal at scale; revenue metrics are not.
  • Lead gen/services: Small-sample tests often fall back to t-distributions rather than pure normal.

The Behavioral Science Connection

The normal distribution underlies a cognitive shortcut: we assume "typical" outcomes cluster around an average, and we treat extreme outcomes as warning signs. This is reasonable for genuinely normal phenomena but dangerous for power-law distributions (viral content, revenue-per-user) where the mean is a poor description of the typical case. Taleb calls this mistaking "Extremistan" for "Mediocristan."

Key Takeaway

The normal distribution is the scaffolding of A/B testing math, but you must verify it applies to your estimator, not assume it describes your raw data.