
The Pitfalls of Next-token Prediction in Language Models

Simons Institute via YouTube

Overview

Explore a thought-provoking lecture that delves into the limitations of next-token prediction in modeling human intelligence. Examine the critical distinction between autoregressive inference and teacher-forced training in language models. Discover why the popular criticism of error compounding during autoregressive inference may overlook a more fundamental issue: the potential failure of teacher-forcing to learn accurate next-token predictors for certain task classes. Investigate a general mechanism of teacher-forcing failure and analyze empirical evidence from a minimal planning task where both Transformer and Mamba architectures struggle. Consider the potential benefits of training models to predict multiple tokens in advance as a possible solution. Gain insights that can inform future debates and inspire research beyond the current next-token prediction paradigm in artificial intelligence.
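The distinction the lecture draws between teacher-forced training and autoregressive inference can be sketched with a toy count-based bigram predictor. This is an illustrative simplification, not the lecture's models or task: during training the predictor only ever sees ground-truth prefixes, while at inference it must consume its own previous outputs. The names `train_teacher_forced` and `generate_autoregressive` are hypothetical, chosen here for clarity.

```python
from collections import Counter, defaultdict


def train_teacher_forced(sequences):
    """Fit a greedy next-token predictor from (ground-truth prefix, next token) pairs.

    Teacher forcing: every training example conditions on the TRUE
    preceding token, never on the model's own earlier predictions.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    # Greedy predictor: map each token to its most frequent successor.
    return {prev: c.most_common(1)[0][0] for prev, c in counts.items()}


def generate_autoregressive(model, start, length):
    """Roll the predictor out on its OWN outputs, as at inference time."""
    out = [start]
    for _ in range(length - 1):
        nxt = model.get(out[-1])
        if nxt is None:  # no successor observed in training
            break
        out.append(nxt)  # condition on the model's own prediction
    return out


# Example: train on two short sequences, then roll out autoregressively.
model = train_teacher_forced(["aba", "bab"])
rollout = generate_autoregressive(model, "a", 4)
```

In this toy setting the rollout happens to match the training distribution, but the mechanical gap is visible: per-step training accuracy is measured against ground-truth prefixes, while the rollout compounds the model's own choices. The lecture's argument is that for some task classes (e.g., problems requiring lookahead before the first token), teacher-forced training fails to produce a good next-token predictor in the first place.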

Syllabus

The Pitfalls of Next-token Prediction

Taught by

Simons Institute

