Overview
Learn how to improve system reliability and user experience in this conference talk on Service Level Objectives (SLOs) and Chaos Engineering. Discover why reliability should be a team-wide responsibility rather than a concern for the SRE team alone, and understand the fundamental concepts of SLOs, their organizational importance, and strategies for implementing them. Explore practical approaches to Chaos Engineering, including compliance and resilience best practices, and learn to use error budgets to improve system reliability. Through real-world examples presented by AWS Senior Developer Advocate Julie Gunderson and PagerDuty DevOps Advocate Mandi Walls, gain insights into applying Service Level Indicators (SLIs), SLOs, and Chaos Engineering principles to delight users and reduce production incidents.
Syllabus
Reducing Trauma in Production with SLOs and Chaos Engineering | Julie Gunderson & Mandi Walls
Taught by
DevOpsDays Tel Aviv