Thunderbolt - Throughput-Optimized, Quality-of-Service-Aware Power Capping at Scale

USENIX via YouTube

Overview

Explore a conference talk on Thunderbolt, a hardware-agnostic power capping system designed for hyperscale data centers. Learn about the challenges of power oversubscription and the need for task-level quality-of-service differentiation in modern compute clusters. Discover how Thunderbolt ensures safe power oversubscription while minimizing impact on both throughput-oriented and latency-sensitive tasks. Examine the system's architecture, mechanisms, and policies, including its two-threshold control policy and use of CPU bandwidth control. Understand the benefits of Thunderbolt's reactive and proactive capping approaches, and see real-world deployment results in production clusters. Gain insights into power efficiency improvements and the potential for significant power oversubscription gains in data center environments.
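
For readers who want a concrete picture of a two-threshold control policy before watching, here is a minimal, generic sketch in Python. It is not Thunderbolt's actual implementation: the class name, the wattage thresholds, the multipliers, and the immediate release below the low threshold are all illustrative assumptions. The helper at the end relies only on the documented behaviour of Linux CFS bandwidth control, where a group may consume `quota` microseconds of CPU time per `period` microseconds.

```python
from dataclasses import dataclass


@dataclass
class TwoThresholdCapper:
    """Hypothetical two-threshold power capper (illustrative only)."""
    high_watts: float          # above this: throttle aggressively
    low_watts: float           # above this, up to high_watts: throttle gently
    hard_factor: float = 0.5   # assumed multiplier for the aggressive cut
    soft_factor: float = 0.9   # assumed multiplier for the gentle cut
    min_cap: float = 0.1       # never throttle below 10% of CPU bandwidth
    cap: float = 1.0           # current CPU fraction allowed for throughput tasks

    def step(self, power_watts: float) -> float:
        """Update and return the CPU cap for one power reading."""
        if power_watts > self.high_watts:
            self.cap = max(self.min_cap, self.cap * self.hard_factor)
        elif power_watts > self.low_watts:
            self.cap = max(self.min_cap, self.cap * self.soft_factor)
        else:
            # Assumption: release the cap as soon as power is back under
            # the low threshold. A real system may recover more gradually.
            self.cap = 1.0
        return self.cap


def cfs_quota_us(cap: float, n_cpus: int, period_us: int = 100_000) -> int:
    """Translate a CPU fraction into a CFS quota (microseconds per period)."""
    return round(cap * n_cpus * period_us)


if __name__ == "__main__":
    capper = TwoThresholdCapper(high_watts=100.0, low_watts=90.0)
    for reading in (80.0, 105.0, 95.0, 80.0):
        cap = capper.step(reading)
        print(reading, cap, cfs_quota_us(cap, n_cpus=4))
```

In this sketch only the throughput-oriented tasks would receive the computed quota, while latency-sensitive tasks stay uncapped. That is one way to read the quality-of-service differentiation the talk describes, and the talk itself covers the actual policy.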

Syllabus

Intro
Motivation: power oversubscription and capping
Motivation: task QoS differentiation
Prior industry solutions did not meet our needs
Architecture
Mechanism and policy details
Why not RAPL or DVFS?
CPU bandwidth control, DVFS, RAPL on Intel Skylake CPU
Reactive capping policy: load shaping
Load shaping on a production cluster
Proactive capping mechanism: CPU jailing (deterministic machine CPU cap)
20% CPU jailing on a production cluster
Proactive capping policy: risk assessment
Deployed in logs processing clusters
Summary

Taught by

USENIX
