Overview
Explore the basics of site reliability engineering for DevOps. Learn SRE techniques for release, change and incident management, self-service automation, and more.
Syllabus
Introduction
- Reliability engineering basics
- What you should know
- Your job as a DevOp
- You aren't Google or Netflix
- Release engineering
- Change management
- Self-service automation
- SLAs and SLOs
- Incident management
- Introducing postmortems
- The postmortem process
- Troubleshooting
- Performance engineering
- Capacity and scalability
- Distributed design
- Deliberate adversity
- Organizing SREs
- The softer side of SRE
- Next steps
Taught by
James Wickett and Ernest Mueller