YouTube videos curated by Class Central.
Classroom Contents
Learning to Summarize from Human Feedback
- 1 - Intro & Overview
- 2 - Summarization as a Task
- 3 - Problems with the ROUGE Metric
- 4 - Training Supervised Models
- 5 - Main Results
- 6 - Including Human Feedback with Reward Models & RL
- 7 - The Unknown Effect of Better Data
- 8 - KL Constraint & Connection to Adversarial Examples
- 9 - More Results
- 10 - Understanding the Reward Model
- 11 - Limitations & Broader Impact