Overview
Syllabus
Intro
There Are No Safe Changes
Minimize the Blast Radius on Changes
Monitor Accurately and Measure Thoroughly
Automate Mitigations
Degraded Service Modes, or An Imperfect Experience Usually Beats a Nonexistent One
Use Functional Gates Pre-, Post- and During Releases
Design to Meet SLAs and Mitigate Incidents Quickly
Regularly Exercise All Processes and Tools
Enforce Processes with Technology
Redirect or Drop Traffic Aggressively During Incidents
Production Quality Tools
Sanitize and Verify Inputs
Understand All of the Scenarios You Support
Transition Service Responsibilities Carefully
Taught by
USENIX