Overview
This course focuses on understanding and defining reliability in platform engineering. Participants learn how to detect platform-level latency regressions, measure their impact, and track performance over time, using a statistical approach that assesses platform performance from the customer's perspective. Topics include end-to-end latency distribution, request delivery latency, reliability in practice, impact analysis, and techniques for diagnosing and testing platform performance. The course is intended for platform engineers, SREs, DevOps professionals, and anyone interested in improving platform reliability and performance.
Syllabus
Intro
Serverless platform is amazing
"My app is slow"
The platform is slow
Total end-to-end latency distribution
Request delivery latency
Goal
Reliability in practice
Applying to the model
Stationarity
2-Sigma Technique
Mechanics
Overload score
Impact analysis
FAQ
Backtesting
Limitations
Other applications
Streamlined diagnosis
Approximate cohort A/B testing
Conclusions
Outro
Taught by
GOTO Conferences