How to Build an LLM from Scratch - An Overview

Shaw Talebi via YouTube

Overview

Dive into a comprehensive 36-minute video tutorial on building Large Language Models (LLMs) from scratch. Explore key aspects of developing foundation LLMs based on models like GPT-3, Llama, and Falcon. Learn about the four crucial steps: data curation, model architecture, training at scale, and evaluation. Discover data sources, diversity, and preparation techniques. Understand transformer architectures, design choices, and model sizing. Gain insights into training stability, hyperparameter tuning, and various evaluation methods for both multiple-choice and open-ended tasks. Access numerous resources and references to deepen your understanding of LLM development.

Syllabus

Intro
How much does it cost?
4 Key Steps
Step 1: Data Curation
1.1: Data Sources
1.2: Data Diversity
1.3: Data Preparation
Step 2: Model Architecture (Transformers)
2.1: 3 Types of Transformers
2.2: Other Design Choices
2.3: How big do I make it?
Step 3: Training at Scale
3.1: Training Stability
3.2: Hyperparameters
Step 4: Evaluation
4.1: Multiple-choice Tasks
4.2: Open-ended Tasks
What's next?

Taught by

Shaw Talebi
