Effective Disaster Recovery - The Day We Deleted Production
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
Save Big on Coursera Plus. 7,000+ courses at $160 off. Limited Time Only!
Explore a real-world disaster recovery scenario in this 37-minute conference talk from KubeCon + CloudNativeCon. Learn how InfluxData accidentally deleted all compute from a busy production cluster, causing a multi-hour outage. Discover the events leading up to the incident, the recovery process, customer reactions, and implemented changes. Gain insights into CI/CD pipeline configurations and the specific change that triggered the outage. Examine the effectiveness of their disaster recovery plan, identifying successful elements and areas for improvement. Benefit from a blend of technical and management perspectives on handling critical infrastructure failures and implementing robust disaster recovery strategies.
Syllabus
Effective Disaster Recovery: The Day We Deleted Production - Rick Spencer & Wojciech Kocjan
Taught by
CNCF [Cloud Native Computing Foundation]