Enter visitors and conversions for each variant. We compute statistical significance using a two-sample Z-test and tell you whether to trust the result.
Variant A (control): 4.20% conversion rate
Variant B (challenger): 5.16% conversion rate
Statistically significant (95%+ confidence)
Confidence level: 97.7%
Lift (B vs A): +22.86%
Z-score: 2.27
Winner: B
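The readout above can be reproduced by hand. Here is a minimal sketch of the pooled two-proportion z-test, assuming illustrative counts of 5,000 visitors per variant (210 and 258 conversions produce the 4.20% and 5.16% rates shown; the actual visitor counts are whatever you enter):

```python
import math

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided confidence level)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the normal CDF, expressed as a confidence level
    phi = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    return z, 1 - 2 * (1 - phi)

z, conf = two_proportion_z(210, 5000, 258, 5000)
print(f"z = {z:.2f}, confidence = {conf:.1%}")   # z = 2.27, confidence = 97.7%
```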
Wait for significance — don't peek. Stopping early when B looks good is the #1 way to call false winners. Pre-commit to a sample size or minimum confidence threshold.
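You can see the peeking problem in a quick A/A simulation (all numbers here are illustrative: a 5% base rate for both variants, a peek every 200 visitors up to 2,000 per variant). Since the variants are identical, every "winner" is a false positive, yet checking at every peek calls one far more often than the nominal 5%:

```python
import math
import random

def z_score(c_a, n_a, c_b, n_b):
    """Pooled two-proportion z-statistic."""
    p_pool = (c_a + c_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return (c_b / n_b - c_a / n_a) / se if se else 0.0

rng = random.Random(42)
BASE_RATE = 0.05            # both variants convert identically
PEEK_EVERY, MAX_N = 200, 2000
SIMS = 500

false_wins = 0
for _ in range(SIMS):
    c_a = c_b = 0
    for n in range(PEEK_EVERY, MAX_N + 1, PEEK_EVERY):
        c_a += sum(rng.random() < BASE_RATE for _ in range(PEEK_EVERY))
        c_b += sum(rng.random() < BASE_RATE for _ in range(PEEK_EVERY))
        if abs(z_score(c_a, n, c_b, n)) >= 1.96:   # "95% significant" at this peek
            false_wins += 1                         # stopped early, false winner
            break

print(f"False positive rate with peeking: {false_wins / SIMS:.0%}")
```

With ten peeks per experiment, the false positive rate typically lands several times above the 5% a single pre-committed look would give.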
Run for a full business cycle. At least 1 full week, ideally 2. Weekday and weekend traffic behave differently, so stopping at 3 days captures only one traffic pattern.
Test one thing at a time. If you change the headline AND the button color AND the hero image, a win doesn't tell you what caused it. Isolate variables.
Expect small lifts. Real winners usually lift 3-15%. Lifts of 50-100% from a single change are almost always noise or a broken implementation.
Sample size matters. With 100 visitors per variant, you need a huge effect to hit 95%. Aim for 1,000+ per variant minimum before drawing conclusions.
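To pre-commit to a sample size, a standard power calculation for the two-proportion z-test works. A sketch with illustrative inputs (4.2% baseline rate, a +15% relative lift you want to detect, 95% confidence, 80% power):

```python
import math
from statistics import NormalDist

def sample_size_per_variant(p_base, rel_lift, alpha=0.05, power=0.80):
    """Visitors needed per variant for a two-sided two-proportion z-test."""
    p1, p2 = p_base, p_base * (1 + rel_lift)
    z_a = NormalDist().inv_cdf(1 - alpha / 2)   # ~1.96 for 95% confidence
    z_b = NormalDist().inv_cdf(power)           # ~0.84 for 80% power
    p_bar = (p1 + p2) / 2
    num = (z_a * math.sqrt(2 * p_bar * (1 - p_bar))
           + z_b * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(num / (p2 - p1) ** 2)

n = sample_size_per_variant(0.042, 0.15)
print(f"{n:,} visitors per variant")   # small lifts need far more than 1,000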
PageStrike starts you on a page that already beats industry benchmarks — less testing required. Free tier, unlimited pages.
Start free