Overview
Explore advanced isolation techniques for large-scale distributed systems in this 16-minute conference talk from SREcon24 Europe/Middle East/Africa, presented by Linhua Tang of Huawei Ireland Research Centre. The talk covers strategies for limiting the impact of failures through spatial and temporal containment. Learn how a cell-based architecture compartmentalizes failures to prevent cascading effects, and how temporal mitigation focuses on rapid recovery and self-healing mechanisms. See how formal methods can be applied to verify design robustness, and how proactive architectural planning and continuous verification help maintain stability in complex distributed environments.
Syllabus
SREcon24 Europe/Middle East/Africa - Blast Radius Reduction for Large-Scale Distributed Systems
Taught by
USENIX