Overview
Explore a 19-minute conference talk from NSDI '22 on Microsoft's BlastShield, a decentralized wide-area network (WAN) traffic engineering system. Learn how it slices a cloud network into smaller fault domains, each managed by an independent controller, to maximize global network throughput without central coordination. Discover how BlastShield achieves performance comparable to a centralized controller while reducing traffic loss from failures by 60%. Gain insights into the system's design assumptions, inter-slice routing, prevention of blast ripple and routing loops, source routing, and the traffic engineering scheduler. Understand how decentralized cloud network management contains the blast radius of faults and improves overall network resilience.
Syllabus
Intro
Software-driven WAN
SWAN traffic
SWAN outage of global scope
BlastShield slices
Design assumptions
BlastShield controller
Inter-slice routing
Blast ripple and routing loops
Source routing
Traffic engineering scheduler
Symphony or cacophony
Blast radius reduction
Summary
Taught by
USENIX