Overview
Learn about critical reliability engineering principles for liquid cooling systems in AI-accelerated computing environments through this 15-minute technical presentation. Explore component-level reliability analysis focusing on quick-disconnect components, with insights into predictive modeling techniques using empirical data and artificial aging methods. Examine physics-based failure mechanisms within cooling loops, including detailed analysis of materials, environmental factors, and hydraulic and mechanical stresses. Discover how to ensure robust rack infrastructure and efficient data center operations when implementing liquid cooling technologies for high-performance computing systems, where reliability is paramount.
Syllabus
Liquid Cooling When Component Failure is Not an Option
Taught by
Open Compute Project