Optical Interconnects for Large-Scale AI Clusters - A Meta Perspective
Open Compute Project via YouTube
Overview
Explore a technical presentation from Meta optical engineer Andrew Alduino on the critical role of optical interconnects in scaling AI infrastructure. Dive into the challenges of building large-scale GPU clusters such as Meta's 24k-GPU systems used for Llama 3 training, focusing on how growing AI workload demands affect IO requirements and accelerator package design. Learn about the limitations of electrical signaling and discover how integrated optics offer a promising alternative with higher bandwidth. Understand the interplay between GPU hardware, system IO, rack design, power delivery, cooling technologies, and memory architectures in modern AI cluster development, with particular emphasis on optimizing optical interconnects for next-generation AI infrastructure.
Syllabus
Optics in AI Clusters - Meta Perspective
Taught by
Open Compute Project