Overview
Syllabus
Welcome and Link to Colab Notebook
Encoder versus Decoder Architectures
What is the GPT-4o architecture?
Recap of transformer for weather prediction
Pre-layer norm versus post-layer norm
RoPE versus Sinusoidal Positional Embeddings
Dummy Data Generation
Transformer Architecture Initialisation
Forward pass test
Training loop setup and test on dummy data
Weather data import
Training and Results Visualisation
Can the model predict the weather?
Is volatility in the loss graph a problem?
How to improve the model further?
Taught by
Trelis Research