Ragan McGill

Practice: Shadow Testing in Production

Purpose and Strategic Importance

Shadow testing in production mirrors or replays real production traffic to a new or updated version of a service without affecting live users. Teams can then validate functionality, performance, and error handling under realistic conditions before formally releasing a change.

Shadow testing helps de-risk releases, uncover unexpected behaviours, and build confidence in production readiness. It is a key technique for progressive delivery and operational resilience.

Description of the Practice

Live traffic is duplicated and sent to both the current production system and the shadow (non-active) service.
Responses from the shadow system are recorded but not returned to users.
Comparison of shadow vs. production responses highlights regressions or unexpected side effects.
Metrics, logs, and traces are analysed to assess shadow service behaviour.
No state changes or side effects from the shadow system are persisted to production databases or other live downstream systems.
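The flow described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the ShadowMirror class, its handler signature, and the in-memory records list are all hypothetical, and in practice the duplication usually happens in a proxy, gateway, or service mesh rather than in application code.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Optional

Request = Dict[str, object]
Response = Dict[str, object]
Handler = Callable[[Request], Response]


class ShadowMirror:
    """Hypothetical helper: serves users from production, copies traffic to a shadow."""

    def __init__(self, production: Handler, shadow: Handler) -> None:
        self.production = production
        self.shadow = shadow
        self.records: List[dict] = []  # recorded pairs, kept for later comparison
        self._pool = ThreadPoolExecutor(max_workers=4)

    def _run_shadow(self, request: Request, prod_response: Response) -> None:
        shadow_response: Optional[Response] = None
        error: Optional[str] = None
        try:
            # Pass a copy so the shadow cannot mutate the original request.
            shadow_response = self.shadow(dict(request))
        except Exception as exc:
            # A shadow failure is recorded, never propagated to the user.
            error = repr(exc)
        self.records.append({
            "request": request,
            "production": prod_response,
            "shadow": shadow_response,
            "shadow_error": error,
        })

    def handle(self, request: Request) -> Response:
        prod_response = self.production(request)
        # The shadow call runs in the background; the user only ever
        # receives the production response.
        self._pool.submit(self._run_shadow, request, prod_response)
        return prod_response

    def drain(self) -> None:
        """Wait for pending shadow calls; no further requests can be handled afterwards."""
        self._pool.shutdown(wait=True)
```

Running the shadow call in a background thread keeps shadow latency and shadow errors away from the user-facing path, which is the property the practice depends on.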

How to Practise It (Playbook)

1. Getting Started

Identify services where change carries significant operational or user risk.
Enable traffic mirroring or duplication through proxies, gateways, or load balancers.
Ensure that shadow services run in an isolated environment (e.g., against a separate database or in read-only mode).
Compare logs, responses, and metrics to detect anomalies or regressions.
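One way to enforce the isolation step is a guard at the data-access layer that refuses writes whenever the service runs as a shadow. The sketch below assumes a simple key-value store; ShadowSafeStore, ReadOnlyViolation, and the shadow_mode flag are illustrative names, not part of any particular library.

```python
from typing import Dict, List, Optional, Tuple


class ReadOnlyViolation(RuntimeError):
    """Raised when a shadow instance attempts to write."""


class ShadowSafeStore:
    """Hypothetical key-value store wrapper that is read-only in shadow mode."""

    def __init__(self, data: Dict[str, str], shadow_mode: bool) -> None:
        self._data = dict(data)
        self._shadow_mode = shadow_mode
        # Attempted writes are kept so they can be reviewed with other shadow results.
        self.blocked_writes: List[Tuple[str, str]] = []

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def put(self, key: str, value: str) -> None:
        if self._shadow_mode:
            self.blocked_writes.append((key, value))
            raise ReadOnlyViolation(f"shadow instance tried to write key {key!r}")
        self._data[key] = value
```

Failing loudly, rather than silently dropping the write, makes an unintended write path visible in the shadow service's error telemetry.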

2. Scaling and Maturing

Automate shadow testing as part of canary or blue-green deployments.
Use diffing tools to compare request/response behaviours at scale.
Track shadow performance vs. production as part of release sign-off criteria.
Enable telemetry on shadow routes using tools such as Prometheus, OpenTelemetry, or Datadog.
Extend shadow testing to include security, accessibility, and edge-case handling.
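As an illustration of diffing at scale and of a sign-off criterion, the sketch below compares JSON-like responses field by field, skips fields that are expected to differ, and turns the result into a mismatch rate checked against a threshold. The function names, the ignored field names, and the 1% default threshold are assumptions chosen for the example, not recommended values.

```python
from typing import Any, FrozenSet, Iterable, List, Tuple


def diff_responses(prod: Any, shadow: Any,
                   ignore: FrozenSet[str] = frozenset(),
                   path: str = "") -> List[str]:
    """Return the dotted paths where shadow differs from prod.

    Assumes JSON-like data with string keys. Keys listed in `ignore`
    (e.g. timestamps or request IDs) are skipped at every nesting level.
    """
    if isinstance(prod, dict) and isinstance(shadow, dict):
        diffs: List[str] = []
        for key in sorted(set(prod) | set(shadow)):
            if key in ignore:
                continue
            child = f"{path}.{key}" if path else key
            if key not in prod or key not in shadow:
                diffs.append(child)  # field present on one side only
            else:
                diffs.extend(diff_responses(prod[key], shadow[key], ignore, child))
        return diffs
    return [] if prod == shadow else [path or "<root>"]


def mismatch_rate(pairs: Iterable[Tuple[Any, Any]],
                  ignore: FrozenSet[str] = frozenset()) -> float:
    """Fraction of (production, shadow) pairs with at least one difference."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    mismatched = sum(1 for prod, shadow in pairs if diff_responses(prod, shadow, ignore))
    return mismatched / len(pairs)


def passes_sign_off(pairs: Iterable[Tuple[Any, Any]],
                    ignore: FrozenSet[str] = frozenset(),
                    max_mismatch_rate: float = 0.01) -> bool:
    """Illustrative release gate: pass only if mismatches stay under the threshold."""
    return mismatch_rate(pairs, ignore) <= max_mismatch_rate
```

Ignoring volatile fields matters in practice: without it, timestamps and generated identifiers make nearly every pair look like a regression and the gate becomes noise.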

3. Team Behaviours to Encourage

Review shadow test results collaboratively before progressing changes.
Use learnings to improve test coverage, observability, and release gates.
Align shadow testing with product risk appetite and compliance needs.
Integrate shadow feedback loops into regular delivery cadence.

4. Watch Out For…

Shadow systems inadvertently writing to production systems.
Incomplete telemetry or low visibility of shadow results.
Performance impacts from traffic duplication without resource scaling.
Using shadow testing as a substitute for earlier test stages.
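One common mitigation for the performance risk listed above is to mirror only a sample of traffic until the shadow environment has been sized for full load. The sketch below shows deterministic, hash-based sampling; should_mirror and its parameters are hypothetical, and the request ID is assumed to be a stable string attached to each request.

```python
import hashlib


def should_mirror(request_id: str, sample_percent: float) -> bool:
    """Decide whether a request is duplicated to the shadow service.

    Hashing the request ID makes the decision deterministic, so the same
    request is always either mirrored or not, regardless of which proxy
    instance handles it. sample_percent is expected in the range 0 to 100.
    """
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000  # buckets 0..9999
    return bucket < sample_percent * 100
```

Starting at a low percentage and raising it as shadow capacity is confirmed keeps the duplication cost proportional to what the environment can absorb.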

5. Signals of Success

Releases are validated safely using real-world traffic before going live.
Shadow results influence rollout decisions and uncover latent defects.
Shadowing is applied routinely for high-risk changes and new services.
Production systems experience fewer regressions post-release.
Confidence in operational readiness improves across teams.