Overview
Explore how Site Reliability Engineering (SRE) practice evolves in this 33-minute conference talk from SREcon18 Asia/Australia. The talk treats SRE as an ongoing journey and walks through the stages of skill acquisition, from Shu to Ha-Ri, with practical examples and a close look at the values and practices typical of each stage. Learn the signposts to look for in incident prevention and handling, postmortems, KPIs/SLOs, monitoring, and capacity management. Use them to assess where your organization currently sits on the SRE spectrum and to plan its next steps, regardless of your background or your company's current level of SRE implementation.
Syllabus
STAGES OF PRACTICE
Shu Signposts: Incident Response
Ha-Ri Signposts: Incident Response
Shu Signposts: Postmortems
Ha-Ri Signposts: Postmortems
Shu Signposts: Incident Prevention
Ha-Ri Signposts: Incident Prevention
Shu Signposts: Monitoring
Ha-Ri Signposts: Monitoring
Shu Signposts: Performance Management
Ha-Ri Signposts: Capacity Planning and Forecasting
Assessing Your Organization's Level of Practice
Each '9' will cost you more than the one before it...
Taught by
USENIX