Pretraining Task Diversity and the Emergence of Non-Bayesian In-Context Learning for Regression

Overview

This lecture by Surya Ganguli of Stanford University examines in-context learning (ICL) in pretrained transformers and asks whether ICL can solve tasks that differ substantially from those seen during pretraining. The talk studies ICL performance on linear regression while varying the diversity of tasks in the pretraining dataset, and identifies a task diversity threshold for the emergence of ICL. Below this threshold, the pretrained transformer behaves like a Bayesian estimator whose prior is the limited set of pretraining tasks, and it cannot solve unseen tasks. Beyond the threshold, the transformer significantly outperforms that estimator and its behavior aligns with ridge regression, which corresponds to a Gaussian prior over all tasks, including those not seen during pretraining. The ability to solve new tasks in-context therefore depends on the transformer deviating from the Bayes optimal estimator that uses the pretraining distribution as its prior. The talk also underscores the role of task diversity, alongside data and model scale, in the emergence of ICL.
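
To make the contrast between the two reference estimators concrete, here is a minimal sketch, not the lecture's code. It assumes scalar regression (one weight per task), a known noise variance, and a standard Gaussian prior for ridge. The function names, the task values, and the seed are all hypothetical choices made for illustration. The first estimator is the Bayes posterior mean under a uniform prior over a small, fixed set of pretraining tasks; the second is ridge regression.

```python
import math
import random


def dmmse_estimate(xs, ys, tasks, noise_var):
    """Posterior mean of w under a uniform prior over the finite set `tasks`."""
    # Gaussian log-likelihood of the context under each pretraining task.
    log_liks = [
        -sum((y - w * x) ** 2 for x, y in zip(xs, ys)) / (2.0 * noise_var)
        for w in tasks
    ]
    # Softmax over tasks, shifted by the max for numerical stability.
    top = max(log_liks)
    weights = [math.exp(ll - top) for ll in log_liks]
    total = sum(weights)
    return sum(wt * w for wt, w in zip(weights, tasks)) / total


def ridge_estimate(xs, ys, noise_var):
    """Posterior mean of w under a N(0, 1) prior (scalar ridge regression)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + noise_var)


def make_context(w, n, noise_var, rng):
    """Sample n noisy (x, y) pairs from the linear task y = w * x + noise."""
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [w * x + rng.gauss(0.0, math.sqrt(noise_var)) for x in xs]
    return xs, ys


rng = random.Random(0)   # assumed seed, for reproducibility only
noise_var = 0.25         # assumed noise variance
tasks = [-1.0, 1.0]      # hypothetical low-diversity pretraining task set
w_new = 2.0              # hypothetical unseen task, not in `tasks`

xs, ys = make_context(w_new, n=32, noise_var=noise_var, rng=rng)
w_dmmse = dmmse_estimate(xs, ys, tasks, noise_var)
w_ridge = ridge_estimate(xs, ys, noise_var)

print("discrete-prior error:", abs(w_dmmse - w_new))
print("ridge error:", abs(w_ridge - w_new))
```

Because the discrete-prior estimate is a weighted average of the pretraining tasks, it can never leave the range they span, so it cannot recover a task outside that range no matter how many context examples it sees. Ridge has no such restriction, which is one way to see why matching ridge on unseen tasks requires departing from the estimator tied to the pretraining task set.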

Syllabus

Pretraining Task Diversity and the Emergence of Non-Bayesian In-Context Learning for Regression

Taught by

Simons Institute

Deep Learning

What Learning Algorithm is In-Context Learning? - Understanding Transformer Models and Neural Sequence Learning

Two Stories in Mechanistic Interpretation of Natural and Artificial Neural Computation

Learning Linear Models In-Context with Transformers

In-Context Learning: A Case Study of Simple Function Classes

In-Context Learning in Deep Learning - How It Works in Large and Small Language Models
