Free tool · PostWyse

A/B Test Significance Calculator

Enter visitors and conversions for your control and variant. Get the relative uplift, statistical significance, p-value, confidence level, and a plain-English verdict on whether the result is real or just noise. No signup.

Control (A) conversion rate: 5.00%

Variant (B) conversion rate: 6.21%

Result

Significant — variant wins

At 99.0% confidence (p = 0.010), this result is unlikely to be chance. You can act on it.

Relative uplift: +24.2%

Confidence: 99.0%

P-value: 0.010

Z-score: 2.57

Two-tailed pooled z-test for two proportions at a 95% confidence threshold (p < 0.05). This tests whether the observed difference is larger than chance alone would plausibly produce, not whether your sample is large enough. Run a sample-size check before you start, and avoid peeking at the result and stopping the moment it crosses the line.
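
The relative uplift figure is simple arithmetic on the two conversion rates. A minimal sketch in Python, assuming the displayed rates (5.00% and 6.21%) are used as-is; the function name `relative_uplift` is illustrative and not taken from the calculator itself:

```python
def relative_uplift(control_rate: float, variant_rate: float) -> float:
    """Relative change of the variant's conversion rate versus the control's."""
    if control_rate <= 0:
        raise ValueError("control_rate must be positive")
    return (variant_rate - control_rate) / control_rate


# Rates as displayed on the page (already rounded to two decimals).
uplift = relative_uplift(0.0500, 0.0621)
print(f"{uplift:+.1%}")  # +24.2%
```

Note that uplift is relative to the control: a move from 5.00% to 6.21% is 1.21 percentage points in absolute terms, but a 24.2% relative increase.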

Don't ship a winner that was just luck

The most common A/B testing mistake is calling a result early. A variant jumps 18% on day two, everyone cheers, the test ends — and the “win” evaporates in production. With small samples, conversion rates swing wildly; significance testing exists to tell the signal from that noise.

This calculator answers one precise question: given the numbers you collected, how confident can you be that the variant genuinely differs from the control? It does not tell you whether you've run the test long enough — that's a sample-size question you should settle before launch. The two together keep you honest.

Reading the result

  • p < 0.05 (95%+): significant. The difference is unlikely to be chance — act on it.
  • 85–95%: trending. Promising but not proven; gather more data.
  • Below 85%: noise. Don't read anything into it yet.
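
The three bands above can be sketched as a small lookup on the p-value. This is an illustration only: the function name `verdict` is hypothetical, and how the exact boundaries (p = 0.05 and p = 0.15) are treated is an assumption, since the page does not say.

```python
def verdict(p_value: float) -> str:
    """Map a two-tailed p-value to the page's three labels.

    Assumption: p < 0.05 is 'significant' (as stated on the page);
    0.05 <= p <= 0.15 (85-95% confidence) is 'trending'; the rest is 'noise'.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    if p_value < 0.05:
        return "significant"
    if p_value <= 0.15:
        return "trending"
    return "noise"


for p in (0.010, 0.10, 0.40):
    print(f"p = {p:.3f} -> {verdict(p)}")
```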

Want this on autopilot?

PostWyse drafts the variants, runs the test, and only surfaces a winner once it clears significance — so you act on real lifts, not lucky days.


Frequently asked

What is statistical significance in an A/B test?

A result is statistically significant when the difference between your control and variant is too large to be plausibly explained by random chance alone. The convention is 95% confidence (a p-value below 0.05), meaning there's less than a 5% chance you'd see a difference this large if the two versions actually performed identically.

What is a p-value?

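The p-value is the probability of observing a difference at least as extreme as yours if there were truly no difference between the variants. A p-value of 0.03 means that, if the two versions really performed identically, you'd see a gap this large only about 3% of the time. It is not the probability that your result is noise. Below 0.05 is the standard bar for declaring a winner.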
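
For a z-test, the two-tailed p-value follows directly from the z-score through the standard normal distribution. A minimal sketch using only Python's standard library, with the z-score of 2.57 from the example result; the helper name `two_tailed_p_value` is illustrative:

```python
import math


def two_tailed_p_value(z: float) -> float:
    """Probability of a standard normal value at least as extreme as z, in either tail."""
    return math.erfc(abs(z) / math.sqrt(2))


p = two_tailed_p_value(2.57)
print(f"p = {p:.3f}, confidence = {1 - p:.1%}")  # p = 0.010, confidence = 99.0%
```

The confidence level reported alongside the p-value is simply one minus the p-value.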

How does this calculator work?

It runs a two-tailed pooled z-test for two proportions — the standard test for comparing conversion rates. It pools the two groups to estimate the shared conversion rate, computes the standard error, derives a z-score, and converts that to a p-value and confidence level.
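
Those steps can be sketched in a few lines of Python. This is an illustration of a pooled two-proportion z-test in general, not the calculator's actual source. The inputs (4,800 visitors per group, 240 and 298 conversions) are hypothetical, chosen only because they give rates close to the 5.00% and 6.21% in the example; the page does not show the real inputs.

```python
import math
from typing import NamedTuple


class ZTestResult(NamedTuple):
    rate_a: float
    rate_b: float
    z: float
    p_value: float


def pooled_z_test(visitors_a: int, conversions_a: int,
                  visitors_b: int, conversions_b: int) -> ZTestResult:
    """Two-tailed pooled z-test for the difference between two proportions."""
    if visitors_a <= 0 or visitors_b <= 0:
        raise ValueError("visitor counts must be positive")
    if not (0 <= conversions_a <= visitors_a and 0 <= conversions_b <= visitors_b):
        raise ValueError("conversions must be between 0 and visitors")

    rate_a = conversions_a / visitors_a
    rate_b = conversions_b / visitors_b

    # Pool both groups to estimate the shared rate under "no difference".
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    if std_error == 0:
        raise ValueError("no variation in the data (0% or 100% conversion overall)")

    z = (rate_b - rate_a) / std_error
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-tailed
    return ZTestResult(rate_a, rate_b, z, p_value)


# Hypothetical inputs (not from the page).
result = pooled_z_test(4800, 240, 4800, 298)
print(f"A = {result.rate_a:.2%}, B = {result.rate_b:.2%}")
print(f"z = {result.z:.2f}, p = {result.p_value:.3f}, confidence = {1 - result.p_value:.1%}")
```

With these assumed inputs the output lands close to the example result shown at the top of the page.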

Why shouldn't I stop a test as soon as it hits significance?

Because of 'peeking.' If you check repeatedly and stop the instant p dips below 0.05, you dramatically inflate your false-positive rate — random fluctuations will cross the line temporarily. Decide your sample size up front, run to it, then read the result once. This calculator tells you if a result is significant, not whether you've collected enough data.
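
A small simulation makes the effect visible. The sketch below runs A/A tests (both arms share the same true conversion rate, so every "significant" result is a false positive) and compares stopping at the first look where p < 0.05 against reading the result once at the end. All parameters (400 runs, 2,000 visitors per arm, 10 looks, a 5% rate, the seed) are arbitrary assumptions for illustration.

```python
import math
import random


def p_value(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-tailed pooled z-test p-value for two proportions."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_error == 0:
        return 1.0  # no conversions (or all conversions): nothing to compare
    z = (conv_b / n_b - conv_a / n_a) / std_error
    return math.erfc(abs(z) / math.sqrt(2))


def simulate(runs: int = 400, visitors_per_arm: int = 2000, looks: int = 10,
             rate: float = 0.05, seed: int = 1) -> tuple[float, float]:
    """Return (false-positive rate with peeking, false-positive rate at the final look)."""
    rng = random.Random(seed)
    step = visitors_per_arm // looks
    peeking_hits = 0
    final_hits = 0
    for _ in range(runs):
        conv_a = conv_b = 0
        crossed = False
        significant = False
        for look in range(1, looks + 1):
            # Both arms draw from the SAME true rate: there is no real difference.
            conv_a += sum(rng.random() < rate for _ in range(step))
            conv_b += sum(rng.random() < rate for _ in range(step))
            n = look * step
            significant = p_value(conv_a, n, conv_b, n) < 0.05
            crossed = crossed or significant
        peeking_hits += crossed      # would have stopped at some look
        final_hits += significant    # significant at the planned end only
    return peeking_hits / runs, final_hits / runs


peeking_rate, final_rate = simulate()
print(f"False positives when peeking: {peeking_rate:.1%}")
print(f"False positives reading once at the end: {final_rate:.1%}")
```

The single-read rate should sit near the nominal 5%, while the peeking rate comes out well above it, because each extra look is another chance for random fluctuation to cross the line.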

How much traffic do I need for an A/B test?

It depends on your baseline conversion rate and the smallest uplift worth detecting — smaller effects need much larger samples. As a rough rule, low single-digit conversion rates often need thousands of visitors per variant to detect a 10–20% relative lift. Run a sample-size calculation before you launch.
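
As a rough sketch of such a calculation, the function below uses the standard normal-approximation formula for comparing two proportions. The 80% power default is a common convention but an assumption here (the page does not state one), and the function name `visitors_per_variant` is illustrative.

```python
import math
from statistics import NormalDist


def visitors_per_variant(baseline_rate: float, relative_lift: float,
                         alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate visitors needed per variant for a two-tailed two-proportion test.

    baseline_rate: control conversion rate, e.g. 0.05 for 5%
    relative_lift: smallest relative uplift worth detecting, e.g. 0.20 for +20%
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    if not (0 < p1 < 1 and 0 < p2 < 1) or relative_lift == 0:
        raise ValueError("rates must be strictly between 0 and 1, and lift non-zero")

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2

    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)


for lift in (0.20, 0.10):
    n = visitors_per_variant(0.05, lift)
    print(f"5% baseline, +{lift:.0%} relative lift: about {n:,} visitors per variant")
```

Halving the detectable lift roughly quadruples the required sample, which is why small effects are so expensive to confirm.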
