Overview
Dive into a comprehensive 36-minute video tutorial on building Large Language Models (LLMs) from scratch. Explore key aspects of developing foundation LLMs such as GPT-3, Llama, and Falcon. Learn about the four crucial steps: data curation, model architecture, training at scale, and evaluation. Discover data sources, diversity considerations, and preparation techniques. Understand transformer architectures, design choices, and model sizing. Gain insights into training stability, hyperparameter tuning, and evaluation methods for both multiple-choice and open-ended tasks. Access numerous resources and references to deepen your understanding of LLM development.
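To make the four steps concrete before watching, here is a minimal sketch in PyTorch that walks through a toy version of each stage: curating a tiny corpus, defining a small decoder-only transformer, running a short training loop, and scoring text by log-likelihood for evaluation. This is not code from the tutorial; every name, dataset, and hyperparameter below is an illustrative assumption chosen only to show how the pieces fit together.

```python
# Illustrative sketch (not the tutorial's code): the four LLM-building steps
# on a character-level toy corpus, using plain PyTorch.
import torch
import torch.nn as nn
import torch.nn.functional as F

# --- Step 1: Data curation (toy corpus, exact-duplicate removal, tokenization) ---
corpus = list(dict.fromkeys([
    "large language models are trained on text",
    "transformers use self attention",
    "large language models are trained on text",  # duplicate, dropped by dedup
]))
vocab = sorted(set("".join(corpus)))
stoi = {ch: i for i, ch in enumerate(vocab)}
encode = lambda s: torch.tensor([stoi[c] for c in s], dtype=torch.long)

# --- Step 2: Model architecture (tiny decoder-only transformer) ---
class TinyGPT(nn.Module):
    def __init__(self, vocab_size, d_model=64, n_head=4, n_layer=2, max_len=128):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, d_model)
        self.pos = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_head, 4 * d_model,
                                           batch_first=True, norm_first=True)
        self.blocks = nn.TransformerEncoder(layer, n_layer)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, idx):
        T = idx.size(1)
        x = self.tok(idx) + self.pos(torch.arange(T, device=idx.device))
        causal_mask = nn.Transformer.generate_square_subsequent_mask(T).to(idx.device)
        return self.head(self.blocks(x, mask=causal_mask))  # causal self-attention

# --- Step 3: Training (next-token prediction; distributed scale is out of scope here) ---
model = TinyGPT(len(vocab))
opt = torch.optim.AdamW(model.parameters(), lr=3e-4)
for step in range(50):
    seq = encode(corpus[step % len(corpus)]).unsqueeze(0)
    logits = model(seq[:, :-1])
    loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)), seq[:, 1:].reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()

# --- Step 4: Evaluation (score a completion by its average token log-likelihood) ---
@torch.no_grad()
def sequence_logprob(text):
    seq = encode(text).unsqueeze(0)
    logp = F.log_softmax(model(seq[:, :-1]), dim=-1)
    return logp.gather(-1, seq[:, 1:].unsqueeze(-1)).mean().item()

print(sequence_logprob("transformers use self attention"))
```

The same log-likelihood scoring idea underlies multiple-choice evaluation: score each candidate answer and pick the highest-scoring one. The tutorial covers each of these stages in far more depth, including data diversity, design choices, model sizing, and training stability.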
Syllabus
Intro
How much does it cost?
4 Key Steps
Step 1: Data Curation
1.1: Data Sources
1.2: Data Diversity
1.3: Data Preparation
Step 2: Model Architecture (Transformers)
2.1: 3 Types of Transformers
2.2: Other Design Choices
2.3: How big do I make it?
Step 3: Training at Scale
3.1: Training Stability
3.2: Hyperparameters
Step 4: Evaluation
4.1: Multiple-choice Tasks
4.2: Open-ended Tasks
What's next?
Taught by
Shaw Talebi