Overview
Explore resilience in modern software systems in this conference talk by Courtney Nash, Senior Research Analyst at Verica. The talk draws on findings from over 10,000 incident reports collected across two years of research and revealed through the Verica Open Incident Database (VOID). Gain counterintuitive insights into common resilience metrics and what they actually imply for Site Reliability Engineers, developers, and tech managers, with findings on system architecture, incident management, and continuous improvement. Challenge your understanding of resilience metrics, explore alternative measures beyond Mean Time to Recovery (MTTR), and learn how to rethink resilience and incident response in complex, distributed systems that demand 24/7 availability.
Syllabus
Intro
Resilience in 3 acts
Act 1: How we talk
Act 2: How we measure
Act 3: If not MTTR...?
Outro
Taught by
GOTO Conferences