Overview
Explore a GopherCon 2022 talk on implementing real-time adaptive controls to improve system resiliency. Learn how CrowdStrike approaches modern distributed systems in which hundreds of tunables affect service resilience. Discover an approach inspired by TCP congestion control that dynamically adjusts parameters based on real-time sampling of errors and latencies. Gain insight into the deployment of this feature in CrowdStrike's large-scale production systems, handling trillions of events daily without incidents. Understand the benefits of minimizing the configuration surface of container workloads and the role of dynamic parameter adjustment in reducing operational toil and preventing system failures.
Syllabus
GopherCon 2022: Nathanial Murphy - Real-time Adaptive Controls for Increased Resiliency
Taught by
Gopher Academy