Canarying Well - Lessons Learned from Canarying Large Populations

USENIX via YouTube

Overview

This conference talk from SREcon18 Europe covers canarying in production environments: common pitfalls, best practices, and an end-to-end strategy for running an effective canary process. Google's Štěpán Davidovič shares lessons on using controlled rollouts to reduce the risk of changes in large-scale systems. Topics include canarying priorities, geographical distribution, high variance among replicas, and bimodal distributions, with examples involving service caches, memory leaks, and compound probabilities. The talk also stresses careful metric selection and analysis, and closes with a three-step canary process for making production changes safer and more reliable.

Syllabus

Intro
Canarying: What is that?
What we're going to talk about
What we're not going to talk about
Conflicting Incentives
Triangle of Canarying Priorities
Example: Geographical distribution
Example: High variance among replicas
Example: Bimodal distribution
Example: Two metrics, different outliers
Takeaways 2
Example: Service With Cache, restarted
Example: Memory leak canary
Example: Before/after test
Example Takeaway
Example: Compound probability
Beware Meta Analysis
Prefer Few Metrics
Canary In These 3 Simple Steps
Canary In These 3-ish Simple Steps

Taught by

USENIX
