Classroom Contents
Deep Dive Into Self-Rewarding Language Models - Training Models as Their Own Judges
- 1 What we’re covering
- 2 The Problem With Human-Labeled Data
- 3 Super-human Agents and Synthetic Data
- 4 What is a Self-Rewarding Language Model
- 5 Skill 1. Instruction Following
- 6 Skill 2. LLM-as-a-Judge
- 7 Prompting as the Judge
- 8 Initialization and Datasets
- 9 Self-Instruction Creation
- 10 AI Feedback Training Data Creation (AIFT)
- 11 Iterative Training
- 12 Evaluation
- 13 Results
- 14 Conclusion
- 15 Join us!