Overview
Explore the intricacies of Reinforcement Learning from Human Feedback (RLHF) in this 36-minute talk by Luis Serrano, PhD, a Machine Learning scientist and educator. Delve into the crucial role of human evaluations in fine-tuning Large Language Models and improving text generation. Serrano draws on his experience at Google, Apple, and Cohere to break down RLHF concepts in natural language processing and AI training. Learn about Large Language Models (Transformers), how they are fine-tuned with RLHF, and get a quick introduction to reinforcement learning, including the PPO and DPO techniques used in preference-based training. Benefit from Serrano's journey from mathematics researcher to AI educator, drawing on his contributions to platforms like Coursera, Udacity, and DeepLearning.ai, and understand how reinforcement learning is making models more helpful and responsive to human needs.
Syllabus
- Introduction
- Large Language Models (Transformers)
- How to fine-tune them with RLHF
- Quick intro to reinforcement learning
- PPO Technique
- DPO Technique
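The PPO and DPO items above can be illustrated with a minimal sketch of their per-example objectives. This is not the speaker's implementation; the function names, the hyperparameter defaults (`eps=0.2`, `beta=0.1`), and the toy log-probabilities below are assumptions for illustration only.

```python
import math

def ppo_clip_objective(logp_new, logp_old, advantage, eps=0.2):
    """PPO's clipped surrogate objective for a single action.

    The probability ratio between the new and old policy is clipped
    to [1 - eps, 1 + eps] so one update cannot move the policy too far.
    """
    ratio = math.exp(logp_new - logp_old)
    clipped = max(min(ratio, 1.0 + eps), 1.0 - eps)
    return min(ratio * advantage, clipped * advantage)

def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for one human preference pair (chosen vs. rejected).

    Inputs are summed token log-probabilities of each response under
    the trained policy and a frozen reference model; beta controls how
    far the policy may drift from the reference. No separate reward
    model or RL loop is needed, unlike PPO-based RLHF.
    """
    chosen_margin = policy_logp_chosen - ref_logp_chosen
    rejected_margin = policy_logp_rejected - ref_logp_rejected
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(logits)): small when the policy prefers the chosen answer
    return -math.log(1.0 / (1.0 + math.exp(-logits)))
```

For example, a policy that has already shifted toward the preferred answer (positive chosen margin, negative rejected margin) incurs a smaller DPO loss than one that matches the reference exactly, and PPO's clipping caps the benefit of inflating the probability ratio beyond `1 + eps`.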
Taught by
Open Data Science