Overview
Learn about advanced performance monitoring techniques for Ethernet in AI/ML clusters in this 17-minute conference talk presented by Jai Kumar from Broadcom and Rita Hui from Microsoft. Discover methods for monitoring and optimizing the performance of the SAI (Switch Abstraction Interface) layer to improve data plane performance and speed up convergence in switch operations. Gain insights into the performance challenges that arise as Ethernet adoption grows in AI/ML clusters, with a focus on both data plane efficiency and programming interface optimization.
Syllabus
SAI APM: Advanced Performance Monitoring
Taught by
Open Compute Project