Explore ServiceLab, Meta's performance testing platform, in this award-winning conference talk from OSDI '24. Delve into the challenges of detecting performance regressions as small as 0.01% in a hyperscale environment. Learn how Meta tackles the complexities of testing diverse applications and ML models that run on millions of machines in production. Discover the statistical analysis methods used to robustly identify small regressions in noisy cloud environments. Gain insights from a large-scale study of millions of performance experiments that identifies the machine factors affecting test result variance. Benefit from the seven years of operational experience Meta shares in managing a wide array of applications at scale.
Overview
Syllabus
OSDI '24 - ServiceLab: Preventing Tiny Performance Regressions at Hyperscale through...
Taught by
USENIX