Demystifying Data Curation for Pretrained Language Models

UofU Data Science via YouTube

Overview

This guest lecture by Kyle Lo, delivered at the University of Utah Data Science department, explains how data is curated for pretrained language models. It covers the methods and best practices for preparing and organizing datasets used to train large language models, along with the considerations, challenges, and solutions in data curation that affect model performance and reliability. The talk also presents practical approaches to data selection, cleaning, and preprocessing. The presentation runs 47 minutes and opens with a brief introduction before moving to the core technical content.

Syllabus

Start
Lecture starts

Taught by

UofU Data Science
