Design for Failure with Graceful Degradation
This standard requires systems to be designed for failure: when a component fails, the system degrades gracefully and keeps serving its core functions instead of failing outright.
1. Design for Failure with Graceful Degradation:
Assume that any component can fail. The system should stay usable in a reduced form during an outage, preserving as much of the user experience as possible.
- 1.1 Failure Mitigation Mechanisms:
- 1.1.1 Circuit Breakers and Retries:
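- Implement circuit breakers and retries around calls to components that can fail, alongside the failover mechanisms in 1.1.2.
- Automate the configuration of circuit breakers (e.g., failure thresholds and reset timeouts) rather than setting it by hand for each service.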
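As an illustration of 1.1.1, the following is a minimal sketch of a circuit breaker combined with retries and exponential backoff. The names `CircuitBreaker`, `CircuitOpenError`, and `call_with_retries`, and the default threshold and timeout values, are hypothetical and chosen for this example only; a production system would more likely rely on an established resilience library.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without trying the dependency."""


class CircuitBreaker:
    """Hypothetical breaker: opens after N consecutive failures."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; call rejected")
            # Reset timeout elapsed: let one trial call through (half-open).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result


def call_with_retries(breaker, func, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Retry func through the breaker, doubling the delay after each failure."""
    for attempt in range(attempts):
        try:
            return breaker.call(func)
        except CircuitOpenError:
            raise  # never retry against an open circuit
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

Passing `failure_threshold` and `reset_timeout` as constructor arguments is one way to let the values come from automated configuration instead of being hard-coded at each call site.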
- 1.1.2 Failover Implementation:
- Automate the implementation of failover mechanisms.
- Track failover events so that each switch to a backup is recorded and can be reviewed.
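One possible shape for 1.1.2 is sketched below: a client that tries endpoints in order of preference and records every failover event. `FailoverClient`, `AllEndpointsFailed`, and the event fields are assumptions made for this example, not part of the standard.

```python
import time


class AllEndpointsFailed(Exception):
    """Raised when every configured endpoint has failed."""


class FailoverClient:
    """Hypothetical client that fails over across an ordered endpoint list."""

    def __init__(self, endpoints, clock=time.time):
        # endpoints: list of (name, callable) pairs, most preferred first.
        self.endpoints = list(endpoints)
        self.clock = clock
        self.events = []  # failover tracking: one record per failed endpoint

    def request(self, payload):
        errors = []
        for name, handler in self.endpoints:
            try:
                result = handler(payload)
            except Exception as exc:
                errors.append((name, exc))
                self.events.append(
                    {"time": self.clock(), "endpoint": name, "error": repr(exc)}
                )
                continue  # fail over to the next endpoint
            return name, result
        raise AllEndpointsFailed(errors)


def primary(payload):
    raise ConnectionError("primary unreachable")


def secondary(payload):
    return payload.upper()


client = FailoverClient([("primary", primary), ("secondary", secondary)])
served_by, response = client.request("ping")
```

Here `client.events` holds one record for the failed primary, which is the kind of data a failover tracking process could export to a monitoring system.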
- 1.2 Fallback Strategies:
- 1.2.1 Cached Content Serving:
- Use fallback strategies (e.g., serving cached content when a service is unavailable).
- Automate the management of cached content.
- 1.2.2 Strategy Implementation:
- Automate the implementation of fallback strategies.
- Collect feedback on fallback strategies (e.g., how often each one is triggered) to assess their effectiveness.
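The cached-content fallback in 1.2 could look roughly like the sketch below. It assumes a hypothetical `CachedFallback` wrapper with an in-memory cache and a `max_stale` limit; real deployments would usually put the cache in a shared store and choose the staleness limit per content type.

```python
import time


class CachedFallback:
    """Hypothetical wrapper: serve the last good value when the service fails."""

    def __init__(self, fetch, max_stale=300.0, clock=time.monotonic):
        self.fetch = fetch          # callable that contacts the live service
        self.max_stale = max_stale  # oldest cached value (seconds) still served
        self.clock = clock
        self._cache = {}            # key -> (value, stored_at)
        self.fallback_hits = 0      # simple feedback signal on fallback usage

    def get(self, key):
        """Return (value, is_stale); is_stale is True when served from cache."""
        try:
            value = self.fetch(key)
        except Exception:
            entry = self._cache.get(key)
            if entry is None:
                raise  # nothing cached: the failure cannot be masked
            value, stored_at = entry
            if self.clock() - stored_at > self.max_stale:
                raise  # cached copy is too old to serve
            self.fallback_hits += 1
            return value, True
        # Live call succeeded: refresh the cache.
        self._cache[key] = (value, self.clock())
        return value, False
```

Returning the `is_stale` flag lets the caller tell users that they are seeing cached content, and `fallback_hits` is one simple input for the feedback collection in 1.2.2.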
- 1.3 Dependent Service Isolation:
- 1.3.1 Outage Prevention:
- Ensure dependent services do not cause a full system outage.
- Automate the monitoring of dependent service impact.
- 1.3.2 Isolation Management:
- Automate the isolation of dependent services.
- Provide tutorials on how to isolate dependent services.
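One common way to approach the isolation in 1.3 is the bulkhead pattern: each dependency gets its own small worker pool and a timeout, so a slow or failing dependency cannot exhaust the resources of the whole system. The sketch below is an assumption-laden illustration; `Bulkhead`, `render_page`, and the recommendations dependency are hypothetical names.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class Bulkhead:
    """Hypothetical bulkhead: a dedicated pool and timeout per dependency."""

    def __init__(self, name, max_workers=4, timeout=1.0):
        self.name = name
        self.timeout = timeout
        self._pool = ThreadPoolExecutor(max_workers=max_workers, thread_name_prefix=name)
        # Rough impact signal; not synchronised across threads in this sketch.
        self.degraded_calls = 0

    def call(self, func, *args, default=None):
        """Run func in this dependency's pool; return default on error or timeout."""
        future = self._pool.submit(func, *args)
        try:
            return future.result(timeout=self.timeout)
        except Exception:
            # Covers both errors raised by func and the result() timeout.
            self.degraded_calls += 1
            return default


def render_page(recs_bulkhead, fetch_recommendations):
    # The core page is always returned; recommendations are optional.
    recs = recs_bulkhead.call(fetch_recommendations, default=[])
    return {"content": "core page", "recommendations": recs}


def slow_recommendations():
    time.sleep(0.3)  # simulates a dependency that is too slow to wait for
    return ["item-1"]


recs_bulkhead = Bulkhead("recommendations", max_workers=2, timeout=0.05)
page = render_page(recs_bulkhead, slow_recommendations)
```

In this sketch the page still renders with an empty recommendations list when the dependency is too slow, and `degraded_calls` gives a basic measure of how often the dependency affected a request.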
By designing for failure with graceful degradation, organisations can improve system resilience and maintain user experience during outages.