Combined Preference and Supervised Fine-Tuning with ORPO

Contents
- 1 Preference and Supervised Fine-tuning at the Same Time!
- 2 A short history of fine-tuning methods
- 3 Video Overview/Agenda
- 4 Differences between Unsupervised, Supervised, and Preference Fine-tuning
- 5 Understanding cross-entropy and odds ratio loss functions (the combined objective is written out after this list)
- 6 Why preference fine-tuning improves performance
- 7 Notebook demo of SFT and ORPO (a minimal training sketch follows this list)
- 8 Evaluation with lm-evaluation-harness (an evaluation sketch also follows)
- 9 Results: Comparing SFT and ORPO with gsm8k, arithmetic and mmlu
- 10 Evaluation with Carlini's practical benchmark
- 11 Is it worth doing ORPO? Yes!
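
Chapter 5 contrasts the two loss terms that ORPO combines. For reference, the objective from the ORPO paper (Hong et al., 2024) can be written out directly; $\lambda$ is the weight on the preference term (exposed as `beta` in some trainer implementations).

```latex
% ORPO objective: standard SFT cross-entropy on the chosen response,
% plus an odds-ratio penalty that prefers the chosen over the rejected one.
\[
\mathcal{L}_{\mathrm{ORPO}}
  = \mathbb{E}_{(x,\,y_w,\,y_l)}\!\left[\mathcal{L}_{\mathrm{SFT}}
      + \lambda\,\mathcal{L}_{\mathrm{OR}}\right],
\]
where $\mathcal{L}_{\mathrm{SFT}}$ is the usual cross-entropy
(negative log-likelihood) of the chosen response $y_w$, and
\[
\mathcal{L}_{\mathrm{OR}}
  = -\log \sigma\!\left(\log
      \frac{\mathrm{odds}_\theta(y_w \mid x)}{\mathrm{odds}_\theta(y_l \mid x)}\right),
\qquad
\mathrm{odds}_\theta(y \mid x)
  = \frac{P_\theta(y \mid x)}{1 - P_\theta(y \mid x)}.
\]
```

Because the odds ratio is computed from the policy's own probabilities, no frozen reference model is needed, which is what lets ORPO fold preference optimization into the same pass as supervised fine-tuning.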
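
Chapter 7 walks through a notebook demo. The video's own notebook is not reproduced here; as a rough stand-in, this is a minimal sketch using Hugging Face TRL's `ORPOTrainer`. The base model (`gpt2`), the dataset (`trl-lib/ultrafeedback_binarized`), and the hyperparameters are assumptions for illustration, not necessarily what the video uses, and argument names vary across TRL releases.

```python
# Minimal ORPO training sketch with Hugging Face TRL.
# Assumptions (not from the video): model "gpt2", dataset
# "trl-lib/ultrafeedback_binarized", and a recent TRL release.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

model_name = "gpt2"  # placeholder; the video likely uses a larger model
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # gpt2 has no pad token by default

# Preference data: rows with "prompt"/"chosen"/"rejected" fields.
train_dataset = load_dataset("trl-lib/ultrafeedback_binarized", split="train")

config = ORPOConfig(
    output_dir="orpo-demo",
    beta=0.1,  # lambda in the ORPO paper: weight on the odds-ratio term
    per_device_train_batch_size=2,
    num_train_epochs=1,
)

trainer = ORPOTrainer(
    model=model,
    args=config,
    train_dataset=train_dataset,
    processing_class=tokenizer,  # named `tokenizer=` in older TRL versions
)
trainer.train()
```

The same data and model with a plain `SFTTrainer` gives the SFT baseline the video compares against; ORPO only differs in consuming the rejected responses as well.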
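
Chapters 8 and 9 score the SFT and ORPO checkpoints with EleutherAI's lm-evaluation-harness on gsm8k, arithmetic, and mmlu. Below is a sketch of that kind of run via the harness's Python entry point, assuming lm-eval v0.4+ and a placeholder checkpoint path; the exact tasks and arguments used in the video may differ.

```python
# Sketch: scoring a fine-tuned checkpoint with lm-evaluation-harness
# (EleutherAI lm-eval, v0.4+ API). The checkpoint path is a placeholder.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",  # Hugging Face transformers backend
    model_args="pretrained=./orpo-demo",  # placeholder path from the sketch above
    tasks=["gsm8k", "mmlu"],  # the video also runs an arithmetic task group
    batch_size=8,
)

# Per-task metrics, e.g. exact_match for gsm8k and acc for mmlu.
for task, metrics in results["results"].items():
    print(task, metrics)
```

Running the same call against both the SFT and the ORPO checkpoints gives the side-by-side numbers the results chapter discusses.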