Overview
Learn about Meta's hardware modifications to the Grand Teton OCP platform in this technical presentation, which covers the integration of MTIA and PCIe CEM accelerators. Discover how existing components such as the chassis, motherboard, management board, and power distribution boards are reused while new features are added to support MTIA inference accelerators. Explore the design of a new carrier card tray that accommodates twelve 220 W PCIe accelerator modules with LPDDR5 memory and features dual PCIe switches, sensor monitoring, firmware management, fault detection, and USB/UART debug capabilities. Gain insight into the optional modular memory/storage expansion tray, which enables host memory expansion through CXL-based DDR4 modules and E1.S data drives, and review the overall rack configuration and the fundamentals of the accelerator module hardware design.
Syllabus
Supporting Meta ML Accelerators on the Grand Teton Platform
Taught by
Open Compute Project