GPU Accelerated Data Curation for Large Language Models

MLOps.community via YouTube

Overview

Explore GPU-accelerated data curation techniques for large language models in this 30-minute talk by Ryan Wolf, a Deep Learning Algorithm Engineer at NVIDIA. Learn why well-curated datasets matter for scaling LLMs and how to create high-quality datasets with NeMo Curator, an open-source library for GPU-accelerated data curation. Gain insight into efficiently scaling datasets to trillions of tokens, a crucial yet often overlooked aspect of machine learning. Wolf's background is in AI systems, and his current focus is developing NeMo Curator. This MLOps.community presentation is part of the DE4AI series.

Syllabus

GPU Accelerated Data Curation for LLMs // Ryan Wolf // DE4AI

Taught by

MLOps.community
