Overview
Explore the challenges of building AI infrastructure in virtualized environments, along with proposed solutions, in this KVM Forum presentation. Delve into the complexities of cloud computing for AI applications, focusing on heterogeneous computing and its unique requirements. Examine two key issues: the performance degradation that the IOMMU causes in PCIe P2P communication between GPUs, or between GPUs and RDMA NICs, and the limitations of traditional PMU virtualization for high-precision monitoring. Discover the proposed solutions, including techniques that keep P2P TLPs from being redirected to the IOMMU and methods for passing core and uncore PMUs through to guest systems. Learn how these approaches aim to narrow the gap between virtualized and bare-metal environments for AI infrastructure. The talk is presented by ByteDance virtualization experts Xin He and Hao Hong.
Syllabus
The Challenges of building AI Infra on virtualization by Xin He & Hao Hong
Taught by
KVM Forum