

Large-Scale AI Datacenter Engineering Techniques for LLM and Generative AI Workloads

SK AI SUMMIT 2024 via YouTube

Overview

Learn about large-scale AI datacenter engineering techniques for LLM and generative AI workloads in this technical conference talk from SK AI SUMMIT 2024. Explore how the Backend.AI platform optimizes utilization of high-performance hardware through its software layers, automating multi-GPU and multi-node setup and dynamically provisioning containers to improve resource efficiency. Discover methods for deploying and serving language and multimodal models of varying parameter sizes and quantization levels on a single infrastructure at optimal cost, incorporating I/O acceleration technologies such as GPUDirect Storage through container-based GPU virtualization. Presented by Joongi Kim, the lead developer of Backend.AI at Lablup, who brings extensive experience in distributed systems and GPU-accelerated computing, along with contributions to CPython and the asyncio ecosystem.

Syllabus

Large-Scale AI Datacenter Engineering Techniques for LLM and Generative AI Workloads | Lablup, Joongi Kim

Taught by

SK AI SUMMIT 2024

