Pre layer norm versus post layer norm
Classroom Contents
Transformers from Scratch - Part 2: Building and Training a Weather Prediction Model
- 1 Welcome and Link to Colab Notebook
- 2 Encoder versus Decoder Architectures
- 3 What is the GPT-4o architecture?
- 4 Recap of transformer for weather prediction
- 5 Pre layer norm versus post layer norm
- 6 RoPE vs Sinusoidal Positional Embeddings
- 7 Dummy Data Generation
- 8 Transformer Architecture Initialisation
- 9 Forward pass test
- 10 Training loop setup and test on dummy data
- 11 Weather data import
- 12 Training and Results Visualisation
- 13 Can the model predict the weather?
- 14 Is volatility in the loss graph a problem?
- 15 How to improve the model further?