Machine Learning Infrastructure at Facebook Scale
MLOps World: Machine Learning in Production via YouTube
Overview
Explore the challenges and solutions in scaling machine learning infrastructure at Facebook in this 18-minute conference talk from MLOps World: Machine Learning in Production. Gain insights into how Facebook's AI Infrastructure team reimagined their entire stack to support rapidly growing ranking models serving over a billion users. Discover the approach taken to redesign and scale the infrastructure, including the creation of specialized hardware using powerful GPUs and network devices, and the development of optimized distributed training algorithms using PyTorch. Learn from Senior AI Infra Engineer Shivam Bharuka as he shares his experience in driving performance, reliability, and efficiency-oriented designs across Facebook's AI Infrastructure components.
Syllabus
Machine Learning Infrastructure at Facebook Scale
Taught by
MLOps World: Machine Learning in Production