Overview
Explore how organizations can leverage incidents and failures to build more resilient systems in this 29-minute conference talk from YOW! 2019. Delve into Seek's approach to handling "Normal Accidents" and learn how incident analysis and post-mortem rituals can lead to viewing diverse software stacks as socio-technical systems. Discover the importance of appreciating human factors in incidents and how involving technology teams in incident investigation can yield richer data for continuous improvement. Gain insights on avoiding common pitfalls, such as obsessing over root causes, and on the limitations of the "5 Whys" technique. Understand how embracing failure in increasingly diverse and complex systems can become a competitive advantage in the DevOps era.
Syllabus
Learning from Incidents • Andrew Hatch • YOW! 2019
Taught by
GOTO Conferences