Practice: A/B Testing
Purpose and Strategic Importance
A/B Testing is the practice of running controlled experiments to compare two or more variations of a feature or experience. It enables teams to validate hypotheses with real user data, reducing risk and increasing confidence in product decisions.
By relying on data over intuition, A/B testing supports safer, more effective releases. It encourages experimentation, accelerates learning, and ensures that engineering effort leads to measurable outcomes.
Description of the Practice
- Two (or more) variants of a feature are released to distinct user segments.
- Performance is measured using predefined success metrics (e.g. conversion, latency, engagement).
- Traffic allocation is controlled (e.g. a 50/50 split or incremental rollout) - see the bucketing sketch after this list.
- Tests are statistically validated before concluding outcomes or rolling out fully.
- Results guide decisions to ship, iterate, or abandon the change.
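A minimal sketch of how controlled traffic allocation can work, assuming a stable user identifier is available. The experiment name, variant labels, and weights below are illustrative assumptions, not a specific platform's API; the key property is that the same user always lands in the same variant.

```python
import hashlib

def assign_variant(user_id: str, experiment: str, weights: dict[str, float]) -> str:
    """Deterministically bucket a user into a variant using a stable hash,
    so the same user sees the same variant for the life of the experiment."""
    # Hash the user and experiment name together so buckets differ per experiment.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF  # value in [0, 1]
    cumulative = 0.0
    for variant, weight in weights.items():
        cumulative += weight
        if bucket < cumulative:
            return variant
    return list(weights)[-1]  # guard against floating-point rounding

# Illustrative 50/50 split between control and treatment.
variant = assign_variant("user-123", "checkout-redesign", {"control": 0.5, "treatment": 0.5})
```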
How to Practise It (Playbook)
1. Getting Started
- Identify a feature or experience you want to validate.
- Define clear success criteria aligned to business or user goals.
- Use an experimentation platform (e.g. LaunchDarkly, Optimizely, custom tooling) to configure test groups.
- Split traffic and monitor impact across key metrics - a minimal significance check is sketched after this list.
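A minimal sketch of validating an outcome, assuming the primary success metric is a conversion rate. It uses a standard two-proportion z-test; the counts and the 0.05 threshold are illustrative only - the threshold should be agreed before the test starts.

```python
from math import sqrt, erf

def two_proportion_ztest(conv_a: int, n_a: int, conv_b: int, n_b: int) -> tuple[float, float]:
    """Compare conversion rates of two variants with a two-sided z-test.
    Returns (z statistic, p-value)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)  # pooled rate under the null hypothesis
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # two-sided p-value
    return z, p_value

# Illustrative numbers only: 1,000 users per variant, 120 vs 150 conversions.
z, p = two_proportion_ztest(conv_a=120, n_a=1000, conv_b=150, n_b=1000)
print(f"z = {z:.2f}, p = {p:.3f}")  # ship only if p is below the agreed threshold, e.g. 0.05
```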
2. Scaling and Maturing
- Run tests across platforms (web, mobile, backend) with consistent frameworks.
- Integrate experimentation into your delivery pipeline - testing becomes part of the release process.
- Use progressive rollout strategies (e.g. ramp-ups, holdouts, kill switches) - see the sketch after this list.
- Build dashboards and analysis tools to help interpret results confidently.
- Create a central playbook or registry for experiments to share learnings.
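A sketch of a percentage ramp-up with a kill switch, reusing the same stable-bucketing idea as the earlier sketch. The `Rollout` structure, field names, and exposure values are assumptions for illustration, not a particular feature-flag platform's API.

```python
import hashlib
from dataclasses import dataclass

@dataclass
class Rollout:
    experiment: str
    exposure: float    # fraction of traffic currently exposed, e.g. 0.05 -> 5%
    kill_switch: bool  # flip to True to stop exposing users immediately

def in_rollout(user_id: str, rollout: Rollout) -> bool:
    """Decide whether a user falls inside the current ramp-up percentage."""
    if rollout.kill_switch:
        return False
    digest = hashlib.sha256(f"{rollout.experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF
    # Buckets are stable, so raising exposure from 0.05 to 0.20 keeps the
    # original 5% of users in the experiment and adds a further 15%.
    return bucket < rollout.exposure

checkout_rollout = Rollout(experiment="checkout-redesign", exposure=0.05, kill_switch=False)
exposed = in_rollout("user-123", checkout_rollout)
```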
3. Team Behaviours to Encourage
- Hypothesise first - avoid launching “just to see what happens.”
- Share and discuss results with product, design, and engineering.
- Run tests even on ideas that feel “obvious” - validate assumptions.
- Align on what “success” means before releasing changes to all users - a lightweight experiment spec like the one sketched below can capture that agreement.
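One way to make the hypothesis and success criteria explicit up front is a short written spec recorded before any traffic is split. The fields and values below are illustrative assumptions, not a prescribed schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ExperimentSpec:
    """Agreed before traffic is split, so "success" is not redefined after the fact."""
    name: str
    hypothesis: str            # what we expect to change, and why
    primary_metric: str        # the single metric the decision hangs on
    minimum_detectable_effect: float
    significance_level: float  # threshold agreed with product, design, and engineering
    decision_rule: str         # what happens if the test wins, loses, or is inconclusive

spec = ExperimentSpec(
    name="checkout-redesign",
    hypothesis="A single-page checkout reduces drop-off at the payment step.",
    primary_metric="checkout_conversion",
    minimum_detectable_effect=0.02,
    significance_level=0.05,
    decision_rule="Ship if conversion improves significantly; otherwise revert and document learnings.",
)
```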
4. Watch Out For…
- Drawing conclusions from incomplete or noisy data - a rough sample-size estimate (sketched after this list) helps guard against this.
- Testing too many variables at once without isolation.
- Continuing failed experiments without clear justification.
- Allowing product decisions to bypass experimentation pipelines.
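A rough estimate of the sample size needed before a test can conclude, using the standard two-proportion approximation. The baseline rate, lift, and power values below are illustrative assumptions; the point is to know in advance how much traffic a trustworthy result requires.

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(baseline: float, lift: float, alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate users needed *per variant* to detect an absolute `lift` over a
    `baseline` conversion rate with a two-sided test at the given alpha and power."""
    p1, p2 = baseline, baseline + lift
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # e.g. ~1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # e.g. ~0.84 for 80% power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil(((z_alpha + z_beta) ** 2) * variance / (lift ** 2))

# Illustrative: users per arm needed to detect a 2-point lift over a 12% baseline.
print(required_sample_size(baseline=0.12, lift=0.02))
```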
5. Signals of Success
- Product decisions are based on validated data, not opinion.
- Teams run multiple experiments per release cycle.
- Fewer “big bang” launches - features are introduced iteratively and safely.
- The cost of failure is reduced through controlled exposure.
- Experimentation is seen as part of delivery, not an optional add-on.