Best Practices for Liquid and Air Cooling of a 51.2Tbps Switch for High-Density AI Clusters
Open Compute Project via YouTube
Overview
Explore thermal management strategies for high-density AI clusters in this technical presentation from Alibaba and Broadcom experts. The talk covers practical approaches to cooling 51.2Tbps AI switches with both air cooling and cold-plate liquid cooling, including design principles, test results, and real-world deployment experience. It also reviews power-efficiency improvements across switch silicon generations and advances in xPU interconnect technologies. The focus throughout is on managing the higher power density and heat dissipation of modern AI training and inference environments, with thermal design insights relevant to building and maintaining high-performance AI computing infrastructure.
Syllabus
Best Practices for Liquid & Air Cooling of a 51.2Tbps Switch for High-Density AI Clusters
Taught by
Open Compute Project