Reinforced Self-Training for Language Modeling - Paper Explained

Overview

Explore a comprehensive video explanation of the Reinforced Self-Training (ReST) method for language modeling. See how ReST bootstraps its own training data: the model generates an extended dataset from its current policy, then trains on increasingly high-quality subsets of that data, ranked by reward, so that its outputs score higher over time. Understand why ReST is more efficient than online reinforcement learning techniques like PPO: the dataset is produced offline, so the generated data can be reused multiple times. Examine the paper's abstract, which describes ReST's application to machine translation and its substantial improvements in translation quality. Learn about the authors and their findings on ReST's compute and sample efficiency in aligning large language models with human preferences.
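
The generate-then-filter loop described above can be illustrated with a toy sketch. This is not the paper's implementation: the four-item candidate set, the reward table, the threshold values, and every function name (grow, improve, rest, expected_reward) are assumptions made for illustration, and refitting a categorical distribution by counting stands in for fine-tuning a language model.

```python
import random
from collections import Counter

def grow(policy, n_samples, rng):
    """Sample a dataset from the current policy (a dict: output -> probability)."""
    outputs = list(policy)
    weights = [policy[o] for o in outputs]
    return rng.choices(outputs, weights=weights, k=n_samples)

def improve(policy, dataset, reward_fn, threshold, smoothing=1e-3):
    """Keep samples whose reward clears the threshold, then refit the policy
    on that subset. Counting with smoothing is a stand-in for fine-tuning."""
    kept = [x for x in dataset if reward_fn(x) >= threshold]
    if not kept:  # nothing cleared the bar; leave the policy unchanged
        return policy
    counts = Counter(kept)
    total = len(kept) + smoothing * len(policy)
    return {o: (counts[o] + smoothing) / total for o in policy}

def rest(policy, reward_fn, thresholds, n_grow_steps, n_samples, seed=0):
    """Alternate dataset generation with several filtered refits."""
    rng = random.Random(seed)
    for _ in range(n_grow_steps):
        dataset = grow(policy, n_samples, rng)  # generated once per outer step
        for tau in thresholds:                  # same data reused at each threshold
            policy = improve(policy, dataset, reward_fn, tau)
    return policy

def expected_reward(policy, reward_fn):
    return sum(p * reward_fn(o) for o, p in policy.items())

# Hypothetical toy problem: four possible outputs with fixed rewards.
candidates = ["bad", "okay", "good", "great"]
toy_reward = {"bad": 0.0, "okay": 0.4, "good": 0.7, "great": 1.0}.get
initial_policy = {c: 0.25 for c in candidates}

final_policy = rest(initial_policy, toy_reward,
                    thresholds=[0.3, 0.6, 0.9],
                    n_grow_steps=2, n_samples=200)
print(expected_reward(initial_policy, toy_reward))
print(expected_reward(final_policy, toy_reward))
```

Note that each generated dataset is filtered and reused at every threshold before new samples are drawn, which is the data-reuse property the overview contrasts with online methods like PPO.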

Syllabus

Reinforced Self-Training (ReST) for Language Modeling (Paper Explained)

Taught by

Yannic Kilcher

Related Courses

Generative AI Advance Fine-Tuning for LLMs

Generative AI Language Modeling with Transformers

Deep Learning for Natural Language Processing

Reinforced Active Learning for Image Segmentation

Direct Preference Optimization (DPO) vs RLHF - Understanding Language Model Training

Training BERT - Masked-Language Modeling
