Ten Persistent SRE Antipatterns - Pitfalls on the Road to a Successful SRE Program

USENIX via YouTube

Overview

Explore ten persistent antipatterns in Site Reliability Engineering (SRE) through this 54-minute conference talk from SREcon17 Americas. Discover common pitfalls organizations face when implementing SRE practices, including misconceptions about monitoring, incident response, configuration management, and automation. Learn how Google and Netflix approach the SRE role and why it differs from traditional systems administration. Gain insights into the importance of freedom, responsibility, trust, and controlled chaos in successful SRE programs. Understand how to avoid negative impacts on operations and empower teams to accomplish their mission effectively.

Syllabus

Intro
Launch Status Check
Service Outages
Host Alerts
What Makes a Good Alert
Noise Floor
SRE Burnout
War Rooms
Sharing
SpaceX
Reliability Theater
Incident Response
Monitoring
Virtualized Servers as Cattle
Containers vs Cattle
Configuration Management
Immutable Infrastructure
Configuration Management Doesn't Scale
Automation Doesn't Scale
Centralized Tools
Automation
Design Systems
Automating
Burnout Team
Feature Releases
Embedded SME
Production Ready Checklist
Periodic Revisiting
Integrations
Uptime
Risk vs Reward
Dad Jokes
The Linkage
Chaos Monkey
Complex Systems
Real Stories
Interview
Perception

Taught by

USENIX
