LLMOps: LLMs Memory and Compute Optimizations

The Machine Learning Engineer via YouTube

Overview

This 24-minute tutorial covers memory and compute optimizations for Large Language Models (LLMs). It looks at FlashAttention and Grouped-Query Attention (GQA), two techniques that make self-attention layers more efficient, and at Fully Sharded Data Parallel (FSDP) and Distributed Data Parallel (DDP), two methods for distributing LLM training and fine-tuning across devices. The video comes with a PowerPoint presentation and a hands-on Jupyter notebook for implementation.
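To give a rough sense of the kind of memory saving GQA targets, here is a minimal back-of-the-envelope sketch. It is not taken from the tutorial or its notebook. The function name `kv_cache_bytes` and the model shape (32 layers, 32 query heads, head dimension 128, 4096-token context, 16-bit values) are illustrative assumptions. It relies only on the standard estimate that a key/value cache stores one key tensor and one value tensor per layer, sized by the number of key/value heads.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len,
                   batch_size=1, bytes_per_elem=2):
    """Approximate key/value cache size in bytes for a decoder-only transformer.

    Hypothetical helper for illustration; bytes_per_elem=2 assumes 16-bit values.
    """
    # Factor of 2: one tensor for keys and one for values in every layer.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * batch_size * bytes_per_elem


# Assumed model shape, chosen only to make the arithmetic easy to follow.
N_LAYERS, N_QUERY_HEADS, HEAD_DIM, SEQ_LEN = 32, 32, 128, 4096

# Standard multi-head attention: every query head has its own key/value head.
mha = kv_cache_bytes(N_LAYERS, N_QUERY_HEADS, HEAD_DIM, SEQ_LEN)

# GQA with 8 key/value heads: each one is shared by a group of 4 query heads.
gqa = kv_cache_bytes(N_LAYERS, 8, HEAD_DIM, SEQ_LEN)

print(f"MHA cache: {mha / 2**30:.2f} GiB")
print(f"GQA cache: {gqa / 2**30:.2f} GiB")
print(f"Reduction: {mha // gqa}x")
```

Under these assumptions the cache shrinks in proportion to the number of key/value heads, so going from 32 to 8 heads cuts it by a factor of four. Real models differ in layer count, head size, and precision, so the absolute numbers are only indicative.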

Syllabus

LLMOps: LLMs Memory and Compute Optimizations

Taught by

The Machine Learning Engineer
