Explore a groundbreaking approach to load balancing in distributed multi-tenant systems with PReQuaL (Probing to Reduce Queuing and Latency). This 19-minute conference talk from NSDI '24 introduces a load balancer designed to minimize real-time request latency in environments with heterogeneous server capacities and non-uniform, time-varying antagonist load. Discover how PReQuaL challenges conventional wisdom: rather than balancing CPU load, it focuses on each server's estimated latency and its number of active requests in flight. Learn about its active probing of server load and how it extends the power-of-d-choices paradigm with asynchronous, reusable probes. Gain insights into PReQuaL's deployment at YouTube, where it has significantly reduced tail latency, error rates, and resource usage, enabling higher utilization across Google's production systems.
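To make the probing idea concrete, here is a minimal Python sketch of choosing a server from a small pool of reusable probe results, using requests in flight (RIF) and estimated latency as the signals. It is an illustration under stated assumptions, not the talk's implementation: the `Probe` type, the `pick_server` function, the `rif_limit` threshold, the `uses_left` reuse budget, and the specific selection rule (lowest estimated latency among probes at or below the RIF limit, otherwise lowest RIF) are all hypothetical simplifications chosen for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Probe:
    """One probe response held in the client's pool (hypothetical shape)."""
    server: str
    rif: int            # requests in flight reported by the server
    latency_ms: float   # latency estimate reported by the server
    uses_left: int = 2  # assumed reuse budget before the probe is discarded


def pick_server(pool: List[Probe], rif_limit: int) -> str:
    """Choose a server from pooled probes (assumed, simplified rule).

    Probes with rif <= rif_limit are treated as lightly loaded; among those,
    the lowest estimated latency wins. If every probe exceeds the limit,
    fall back to the probe with the fewest requests in flight.
    """
    if not pool:
        raise ValueError("probe pool is empty")
    lightly_loaded = [p for p in pool if p.rif <= rif_limit]
    if lightly_loaded:
        best = min(lightly_loaded, key=lambda p: p.latency_ms)
    else:
        best = min(pool, key=lambda p: p.rif)
    # Probes are reusable: consume one use, and drop the probe when exhausted.
    best.uses_left -= 1
    if best.uses_left <= 0:
        pool.remove(best)
    return best.server


# Usage: probes would arrive asynchronously in a real system; here the pool
# is filled by hand with made-up values.
pool = [Probe("a", 3, 12.0), Probe("b", 9, 4.0), Probe("c", 2, 20.0)]
print(pick_server(pool, rif_limit=5))  # prints "a"
```

Server `b` reports the lowest latency but is skipped because its RIF exceeds the assumed limit, which shows why the two signals are considered together rather than latency alone. A real balancer would also refresh the pool with new probes in the background and age out stale ones; those parts are omitted here.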