PM A/B Testing
(2026 Edition)
5 essentials and 5 traps for statistically honest A/B testing.
5 Essentials
Hypothesis before data: commit in advance to what you expect and what success means
Minimum detectable effect (MDE): the smallest effect worth caring about
Sample size calculation: run until powered, not until significant
Guardrail metrics: watch retention and crash rate while optimising your target
Novelty and primacy effects: new features can win early on novelty alone (or lose early to old habits); wait for steady state
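To make "run until powered" concrete, here is a minimal sketch of a per-arm sample size estimate from a baseline conversion rate and an absolute MDE. It assumes a two-sided two-proportion z-test with equal arm sizes; the function name, parameters, and defaults (alpha 0.05, power 0.80) are illustrative choices, not prescribed by this guide.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_arm(baseline_rate, mde_abs, alpha=0.05, power=0.80):
    """Approximate users needed per arm (hypothetical helper).

    Assumes a two-sided two-proportion z-test with equal arm sizes.
    mde_abs is the absolute lift, e.g. 0.02 means 10% -> 12%.
    """
    p1 = baseline_rate
    p2 = baseline_rate + mde_abs
    p_bar = (p1 + p2) / 2
    # Critical values for the chosen false positive rate and power.
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / mde_abs ** 2)


if __name__ == "__main__":
    # Illustrative inputs: 10% baseline, 2-point absolute MDE.
    print(sample_size_per_arm(0.10, 0.02))
    # Halving the MDE roughly quadruples the required sample.
    print(sample_size_per_arm(0.10, 0.01))
```

The required sample grows with the inverse square of the MDE, which is why the MDE should be fixed before the test starts rather than tuned to fit the traffic you happen to have.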
5 Traps
Peeking at results before the test ends: inflates the false positive rate
Running too many tests at once: interactions poison conclusions
Ignoring negative secondary effects: a 'winner' on CTR may lose on retention
P-hacking: slicing data until something is significant
Shipping wins without understanding why: statistical wins without causal stories don't compound
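The peeking trap can be seen in a small simulation. The sketch below runs A/A tests (both arms share the same true rate, so every "significant" result is a false positive) and compares stopping at the first significant interim look with testing once at the end. All names and parameter values are hypothetical, and the test used is a pooled two-proportion z-test.

```python
import random
from math import sqrt
from statistics import NormalDist


def p_value(conv_a, conv_b, n):
    """Two-sided p-value of a pooled two-proportion z-test, n users per arm."""
    pooled = (conv_a + conv_b) / (2 * n)
    se = sqrt(2 * pooled * (1 - pooled) / n)
    if se == 0:
        return 1.0
    z = ((conv_b - conv_a) / n) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rates(n_sims=500, looks=10, users_per_look=100,
                         rate=0.10, alpha=0.05, seed=0):
    """Return (rate with peeking, rate with a single final test) for A/A tests."""
    rng = random.Random(seed)
    peeking_hits = 0
    final_hits = 0
    for _ in range(n_sims):
        conv_a = conv_b = n = 0
        significant_at_any_look = False
        for _ in range(looks):
            # Both arms draw from the same true conversion rate.
            conv_a += sum(rng.random() < rate for _ in range(users_per_look))
            conv_b += sum(rng.random() < rate for _ in range(users_per_look))
            n += users_per_look
            p = p_value(conv_a, conv_b, n)
            if p < alpha:
                significant_at_any_look = True
        peeking_hits += significant_at_any_look
        final_hits += p < alpha  # p from the last look only
    return peeking_hits / n_sims, final_hits / n_sims


if __name__ == "__main__":
    peeking, final = false_positive_rates()
    print(f"stop at first significant look: {peeking:.1%}")
    print(f"single test at the end:         {final:.1%}")
```

The single final test should land near the nominal alpha, while the peeking rate sits well above it. If interim looks are unavoidable, a sequential testing method designed for them is the usual remedy; that is outside the scope of this sketch.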
FAQ
What's the right significance threshold for PM A/B tests?
A 95% confidence level is the convention, but it is not sacred. For low-risk, reversible changes, 80–90% is often reasonable. For high-risk or irreversible changes, aim for 95% or higher and check guardrail metrics before shipping. The honest question isn't 'is this significant?' but 'am I confident enough to ship given the downside?'
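As a rough illustration of a risk-dependent threshold, the sketch below applies the same test result to two different confidence levels. The risk tiers, function names, and conversion counts are made up for the example; it assumes a two-sided pooled two-proportion z-test and treats the threshold as alpha = 1 minus the confidence level.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical mapping from change risk to required confidence level.
CONFIDENCE_BY_RISK = {"low": 0.90, "high": 0.95}


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value of a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0
    z = (conv_b / n_b - conv_a / n_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def clears_threshold(conv_a, n_a, conv_b, n_b, risk):
    """True if the result is significant at the level required for this risk tier."""
    alpha = 1 - CONFIDENCE_BY_RISK[risk]
    return two_proportion_p_value(conv_a, n_a, conv_b, n_b) < alpha


if __name__ == "__main__":
    # Made-up counts: control 500/5000, variant 555/5000.
    args = (500, 5000, 555, 5000)
    print("p-value:", round(two_proportion_p_value(*args), 4))
    print("ship if low risk: ", clears_threshold(*args, risk="low"))
    print("ship if high risk:", clears_threshold(*args, risk="high"))
```

With these counts the same result clears the low-risk bar and misses the high-risk one, which is the point: the threshold encodes how costly a wrong call would be, not a universal truth about the data.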