Reinforcement Learning with Human Feedback - Understanding LLM Fine-tuning with PPO and DPO

Open Data Science via YouTube

Classroom Contents

  1. 1 - Introduction
  2. 2 - Large Language Models / Transformers
  3. 3 - How to fine-tune them with RLHF
  4. 4 - Quick intro to reinforcement learning
  5. 5 - PPO reinforcement learning technique to fine-tune LLMs
  6. 6 - DPO non-reinforcement learning technique to fine-tune LLMs
