
Large Model Training and Inference with DeepSpeed

MLOps.community via YouTube

Overview

Explore the journey of DeepSpeed and its transformative impact on large model training and inference in this 36-minute conference talk by Samyam Rajbhandari at the LLMs in Prod Conference. Discover how technologies like ZeRO and 3D-Parallelism have become fundamental building blocks for training large language models at scale, powering LLMs such as Bloom-176B and Megatron-Turing 530B.

Learn about heterogeneous memory training systems like ZeRO-Offload and ZeRO-Infinity, which have democratized LLMs by making them trainable with limited resources. Gain insights into DeepSpeed-Inference and DeepSpeed-MII, which simplify the application of powerful inference optimizations to accelerate LLMs for deployment.

Understand how DeepSpeed has been integrated into platforms like HuggingFace, PyTorch Lightning, and Mosaic ML, and how its technologies have been adopted in PyTorch, Colossal-AI, and Megatron-LM. Delve into the motivations, insights, and stories behind the development of these groundbreaking technologies, which have reshaped large language model training and inference.
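The heterogeneous-memory systems mentioned above, ZeRO-Offload and ZeRO-Infinity, are typically enabled through a DeepSpeed JSON configuration file. As a rough illustration only (the key names follow DeepSpeed's public config schema, but the values here are placeholders, not recommendations from the talk), a ZeRO stage-3 setup that offloads optimizer state and parameters to CPU memory might look like:

```json
{
  "train_batch_size": 32,
  "fp16": { "enabled": true },
  "zero_optimization": {
    "stage": 3,
    "offload_optimizer": { "device": "cpu", "pin_memory": true },
    "offload_param": { "device": "cpu", "pin_memory": true }
  }
}
```

Stage 3 partitions optimizer state, gradients, and parameters across devices; the offload sections are what let models larger than GPU memory be trained on modest hardware, which is the democratization point the talk highlights.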

Syllabus

Large Model Training and Inference with DeepSpeed // Samyam Rajbhandari // LLMs in Prod Conference

Taught by

MLOps.community

