Practice: A/B Testing
Purpose and Strategic Importance
A/B Testing is the practice of running controlled experiments to compare two or more variations of a feature or experience. It enables teams to validate hypotheses with real user data, reducing risk and increasing confidence in product decisions.
By relying on data over intuition, A/B testing supports safer, more effective releases. It encourages experimentation, accelerates learning, and ensures that engineering effort leads to measurable outcomes.
Description of the Practice
- Two (or more) variants of a feature are released to distinct user segments.
- Performance is measured using predefined success metrics (e.g. conversion, latency, engagement).
- Traffic allocation is controlled (e.g. a 50/50 split or incremental rollout) - see the bucketing sketch after this list.
- Tests are statistically validated before concluding outcomes or rolling out fully.
- Results guide decisions to ship, iterate, or abandon the change.
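A minimal sketch of how controlled traffic allocation can work, assuming a stable user identifier is available. The experiment name, variant labels, and weights below are illustrative assumptions, not a specific platform's API; the key property is that the same user always lands in the same variant.

```python
import hashlib

def assign_variant(user_id: str, experiment: str, weights: dict[str, float]) -> str:
    """Deterministically bucket a user into a variant using a stable hash,
    so the same user sees the same variant for the life of the experiment."""
    # Hash the user and experiment name together so buckets differ per experiment.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF  # value in [0, 1]
    cumulative = 0.0
    for variant, weight in weights.items():
        cumulative += weight
        if bucket < cumulative:
            return variant
    return list(weights)[-1]  # guard against floating-point rounding

# Illustrative 50/50 split between control and treatment.
variant = assign_variant("user-123", "checkout-redesign", {"control": 0.5, "treatment": 0.5})
```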
How to Practise It (Playbook)
1. Getting Started
- Identify a feature or experience you want to validate.
- Define clear success criteria aligned to business or user goals.
- Use an experimentation platform (e.g. LaunchDarkly, Optimizely, custom tooling) to configure test groups.
- Split traffic and monitor impact across key metrics - a minimal significance check is sketched after this list.
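A minimal sketch of validating an outcome, assuming the primary success metric is a conversion rate. It uses a standard two-proportion z-test; the counts and the 0.05 threshold are illustrative only - the threshold should be agreed before the test starts.

```python
from math import sqrt, erf

def two_proportion_ztest(conv_a: int, n_a: int, conv_b: int, n_b: int) -> tuple[float, float]:
    """Compare conversion rates of two variants with a two-sided z-test.
    Returns (z statistic, p-value)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)  # pooled rate under the null hypothesis
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # two-sided p-value
    return z, p_value

# Illustrative numbers only: 1,000 users per variant, 120 vs 150 conversions.
z, p = two_proportion_ztest(conv_a=120, n_a=1000, conv_b=150, n_b=1000)
print(f"z = {z:.2f}, p = {p:.3f}")  # ship only if p is below the agreed threshold, e.g. 0.05
```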
2. Scaling and Maturing
- Run tests across platforms (web, mobile, backend) with consistent frameworks.
- Integrate experimentation into your delivery pipeline - testing becomes part of the release process.
- Use progressive rollout strategies (e.g. ramp-ups, holdouts, kill switches) - see the sketch after this list.
- Build dashboards and analysis tools to help interpret results confidently.
- Create a central playbook or registry for experiments to share learnings.
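A sketch of a percentage ramp-up with a kill switch, reusing the same stable-bucketing idea as the earlier sketch. The `Rollout` structure, field names, and exposure values are assumptions for illustration, not a particular feature-flag platform's API.

```python
import hashlib
from dataclasses import dataclass

@dataclass
class Rollout:
    experiment: str
    exposure: float    # fraction of traffic currently exposed, e.g. 0.05 -> 5%
    kill_switch: bool  # flip to True to stop exposing users immediately

def in_rollout(user_id: str, rollout: Rollout) -> bool:
    """Decide whether a user falls inside the current ramp-up percentage."""
    if rollout.kill_switch:
        return False
    digest = hashlib.sha256(f"{rollout.experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF
    # Buckets are stable, so raising exposure from 0.05 to 0.20 keeps the
    # original 5% of users in the experiment and adds a further 15%.
    return bucket < rollout.exposure

checkout_rollout = Rollout(experiment="checkout-redesign", exposure=0.05, kill_switch=False)
exposed = in_rollout("user-123", checkout_rollout)
```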
3. Team Behaviours to Encourage
- Hypothesise first - avoid launching “just to see what happens.”
- Share and discuss results with product, design, and engineering.
- Run tests even on ideas that feel “obvious” - validate assumptions.
- Align on what “success” means before releasing changes to all users - a lightweight experiment spec like the one sketched below can capture that agreement.
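One way to make the hypothesis and success criteria explicit up front is a short written spec recorded before any traffic is split. The fields and values below are illustrative assumptions, not a prescribed schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ExperimentSpec:
    """Agreed before traffic is split, so "success" is not redefined after the fact."""
    name: str
    hypothesis: str            # what we expect to change, and why
    primary_metric: str        # the single metric the decision hangs on
    minimum_detectable_effect: float
    significance_level: float  # threshold agreed with product, design, and engineering
    decision_rule: str         # what happens if the test wins, loses, or is inconclusive

spec = ExperimentSpec(
    name="checkout-redesign",
    hypothesis="A single-page checkout reduces drop-off at the payment step.",
    primary_metric="checkout_conversion",
    minimum_detectable_effect=0.02,
    significance_level=0.05,
    decision_rule="Ship if conversion improves significantly; otherwise revert and document learnings.",
)
```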
4. Watch Out For…
- Drawing conclusions from incomplete or noisy data - a rough sample-size estimate (sketched after this list) helps guard against this.
- Testing too many variables at once without isolation.
- Continuing failed experiments without clear justification.
- Allowing product decisions to bypass experimentation pipelines.
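A rough estimate of the sample size needed before a test can conclude, using the standard two-proportion approximation. The baseline rate, lift, and power values below are illustrative assumptions; the point is to know in advance how much traffic a trustworthy result requires.

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(baseline: float, lift: float, alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate users needed *per variant* to detect an absolute `lift` over a
    `baseline` conversion rate with a two-sided test at the given alpha and power."""
    p1, p2 = baseline, baseline + lift
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # e.g. ~1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # e.g. ~0.84 for 80% power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil(((z_alpha + z_beta) ** 2) * variance / (lift ** 2))

# Illustrative: users per arm needed to detect a 2-point lift over a 12% baseline.
print(required_sample_size(baseline=0.12, lift=0.02))
```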
5. Signals of Success
- Product decisions are based on validated data, not opinion.
- Teams run multiple experiments per release cycle.
- Fewer “big bang” launches - features are introduced iteratively and safely.
- The cost of failure is reduced through controlled exposure.
- Experimentation is seen as part of delivery, not an optional add-on.