Reinforcement Learning from Human Feedback - From Zero to ChatGPT
- 1 Introduction
- 2 Recent breakthroughs
- 3 What is RL
- 4 History of RL
- 5 Example of RL
- 6 ChatGPT
- 7 Technical details
- 8 Three conceptual parts
- 9 NLP Pretraining
- 10 Supervised Finetuning
- 11 Reward Model Training
- 12 Input and Output Pairs
- 13 Reward Model
- 14 KL Divergence
- 15 Scaling Factor
- 16 RL Optimizer
- 17 PPO
- 18 Conceptual Questions
- 19 Prompts and Responses
- 20 Anthropic
- 21 BlenderBot
- 22 Thumbs up and thumbs down
- 23 ChatGPT example
- 24 ChatGPT vs Anthropic
- 25 Open areas of investigation
- 26 Wrap up
- 27 Q&A
- 28 Open Source Community
- 29 Reinforcement Learning from Email
- 30 Paper Release