Learning to Summarize from Human Feedback

Yannic Kilcher via YouTube

Overview

Explore an in-depth analysis of OpenAI's paper on improving text summarization through human feedback in this 46-minute video. Dive into the challenges of training and evaluating summarization models, learn about the limitations of traditional metrics like ROUGE, and discover how incorporating direct human feedback can significantly enhance summary quality. Examine the novel approach of training a reward model as a proxy for human preferences and then optimizing it with reinforcement learning, yielding summaries that human evaluators prefer over the human-written references. Follow along as the video breaks down key concepts, methodologies, and results, including the application to Reddit posts and transfer to news articles. Gain insights into the broader implications of this research for machine learning and natural language processing.
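The core training signal covered in the video (reward model plus a KL penalty keeping the policy near the supervised baseline) can be sketched as below. This is a minimal illustration of the idea, not the paper's code: the function name, helper names, and the `beta` value are assumptions.

```python
def kl_penalized_reward(reward_model_score: float,
                        logp_policy: float,
                        logp_sft: float,
                        beta: float = 0.05) -> float:
    """Per-summary RL reward in the style discussed in the video.

    reward_model_score: scalar score from the learned reward model
    logp_policy: log-probability of the summary under the RL policy
    logp_sft: log-probability under the supervised (SFT) baseline
    beta: KL penalty coefficient (illustrative value)

    The KL term penalizes the policy for drifting far from the
    supervised model, which helps prevent the degenerate,
    reward-hacking outputs the video connects to adversarial examples.
    """
    kl_term = logp_policy - logp_sft  # sample-based KL estimate
    return reward_model_score - beta * kl_term


# When the policy matches the baseline, the KL term vanishes and the
# reward model's score passes through unchanged.
r = kl_penalized_reward(1.0, logp_policy=-10.0, logp_sft=-10.0)
```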

Syllabus

- Intro & Overview
- Summarization as a Task
- Problems with the ROUGE Metric
- Training Supervised Models
- Main Results
- Including Human Feedback with Reward Models & RL
- The Unknown Effect of Better Data
- KL Constraint & Connection to Adversarial Examples
- More Results
- Understanding the Reward Model
- Limitations & Broader Impact

Taught by

Yannic Kilcher
