Explore a 45-minute conference talk from SNIA SDC 2024 on accelerating GPU server access to network-attached disaggregated storage using DPU technology. Learn about the transformative impact of AI on data center storage architectures and the difficulty CPU-centric servers have keeping up with massive data processing demands. Discover how AMD and MangoBoost address these performance bottlenecks by offloading infrastructure functions, including networking, storage, and GPU peer-to-peer communication, to a DPU. Gain insight into AI trends such as Large Language Models (LLMs), examine AMD's GPU systems, and understand the DPU's role in making storage access more efficient. Study a real-world case study featuring AMD MI300X GPU servers running the open-source ROCm software stack together with MangoBoost's GPU-storage-boost DPU technology, demonstrating improved performance through NVMe over TCP and peer-to-peer communication. Come away with an understanding of the latest developments in AI infrastructure optimization and of how DPUs can raise GPU efficiency and accelerate AI model training.
SNIA SDC 2024 - Accelerating GPU Server Access to Network-Attached Disaggregated Storage Using DPU
Taught by SNIAVideo