

Metastable Failures in the Wild

USENIX via YouTube

Overview

This 16-minute conference talk from OSDI '22 analyzes metastable failures in distributed systems. The researchers studied 22 metastable failures across 11 organizations, ranging from small companies to hyperscalers, and show that these failures are a recurring pattern in major outages. The talk presents an extended model of metastable failures, including two types of triggers and amplification mechanisms, and illustrates it with real-world examples and with examples the researchers reproduced in controlled environments. It also covers the implications for system design and reliability.

Syllabus

Intro
What are Metastable Failures?
Metastable Failures are Prevalent
Metastability in the Wild - Survey
Defining Metastability - System States
Survey Summary
Metastability Taxonomy - Trigger
Metastability Taxonomy - Sustaining ef
Four Metastability Scenarios Load-spike trigger
Degrees of Vulnerabilities
Lessons
Conclusion

Taught by

USENIX
