Overview
Explore an analytical method for analyzing and diagnosing distributed systems with high fidelity in this SREcon22 Americas conference talk. Delve into the problem domain, statistical methods, and intuition behind an approach used successfully in production for complex services at scale. Gain an alternative perspective on performance analysis and understand potential pitfalls. Google SREs Narayan Desai and Brent Bryan share their experience with principled performance analytics, covering reliability, SLOs, qualitative versus quantitative analysis, high-level modeling, and real-world applications in error detection and diagnosis.
Syllabus
Introduction
Is It Working?
Reliability
The Problem with SLOs
A Simple Example
How Does This Play Out
The Prize
Errors
Qualitative vs Quantitative
Performance Analytics Example
High-Level Model
Two Sigma
Building Detection
FAQ
Real Data
Diagnosis
Excursions
Decoration
Taught by
USENIX