Overview

Learn about large-scale AI datacenter engineering techniques for LLM and generative AI workloads in this technical conference talk from SK AI SUMMIT 2024. Explore how the Backend.AI platform optimizes high-performance hardware utilization through its software layers, automating multi-GPU and multi-node setups while dynamically provisioning containers to improve resource efficiency. Discover methods for deploying and serving language and multimodal models of different parameter sizes and quantization levels on a single infrastructure at optimal cost, incorporating I/O acceleration technologies such as GPUDirect Storage through container-based GPU virtualization. Presented by Joongi Kim, the lead developer of Backend.AI at Lablup, who brings extensive experience in distributed systems and GPU-accelerated computing, along with contributions to CPython and the asyncio ecosystem.
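As a rough illustration of what dynamically provisioning containers with GPU access can look like at the container-runtime level, the sketch below requests a GPU count per workload instead of pinning device IDs. This uses the generic Docker Python SDK, not Backend.AI's own scheduler or API; the image name and GPU count are hypothetical.

```python
# A minimal sketch of runtime-allocated GPU containers, assuming Docker with
# the NVIDIA container runtime and the `docker` Python SDK installed.
# This is NOT Backend.AI's API; it only illustrates the general mechanism.
import docker

client = docker.from_env()

def launch_llm_worker(image: str, num_gpus: int):
    """Start a detached container with `num_gpus` GPUs allocated by the runtime."""
    return client.containers.run(
        image,
        detach=True,
        device_requests=[
            # Ask the runtime for any `num_gpus` free GPUs rather than
            # hard-coding device IDs, keeping placement flexible per node.
            docker.types.DeviceRequest(count=num_gpus, capabilities=[["gpu"]]),
        ],
    )

# Hypothetical usage: a 2-GPU worker for a quantized model. The image name is
# illustrative, not something referenced in the talk.
worker = launch_llm_worker("vllm/vllm-openai:latest", num_gpus=2)
print(worker.id)
```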
Syllabus
Large-Scale AI Datacenter Engineering Techniques for LLM and Generative AI Workloads | Joongi Kim, Lablup
Taught by
SK AI SUMMIT 2024