A Theory for Emergence of Complex Skills in Language Models

Simons Institute via YouTube

Overview

Explore a lecture by Sanjeev Arora of Princeton University on how complex skills emerge in language models. As the parameters and training corpora of Large Language Models are scaled up, new skills emerge, and this phenomenon remains poorly understood. The talk analyzes emergence using the empirical Scaling Laws of LLMs and a simple statistical framework. Its contributions include a statistical framework relating cross-entropy loss to competence on the basic skills that underlie language tasks; a mathematical analysis revealing a strong form of inductive bias called "slingshot generalization"; and an example demonstrating that competence in executing tasks involving k-tuples of skills emerges at the same scaling and rate as competence on the elementary skills themselves. The results challenge conventional generalization theory and offer a new perspective on the capabilities of language models.

Syllabus

A Theory for Emergence of Complex Skills in Language Models

Taught by

Simons Institute
