Overview
Explore a lecture on the emergence of complex skills in language models, presented by Sanjeev Arora of Princeton University. Delve into large language models and Transformers, examining the poorly understood phenomenon of new skills emerging as parameter counts and training corpora are scaled up. Discover a novel approach that analyzes emergence using empirical scaling laws of LLMs together with a simple statistical framework. Learn about the contributions of this research: a statistical framework relating cross-entropy loss to competence on the basic skills underlying language tasks; a mathematical analysis revealing a strong form of inductive bias, dubbed "slingshot generalization"; and an example showing that competence on tasks combining k-tuples of skills emerges at essentially the same rate, and around the same point in scaling, as competence on the elementary skills themselves. Gain insight into research that challenges conventional generalization theory and offers a new perspective on the capabilities of language models.
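A rough feel for the k-tuple claim can be had from a toy calculation. The sketch below is not the framework presented in the lecture: the Chinchilla-style scaling-law constants (n0, alpha) and the linear mapping from excess cross-entropy loss to a per-skill failure rate are illustrative assumptions only. It shows how, once the per-skill failure rate falls with scale, competence on tasks requiring all of k skills climbs sharply at roughly the same point in scaling.

def excess_loss(n_params: float, n0: float = 1e8, alpha: float = 0.3) -> float:
    # Toy scaling law (illustrative constants): excess cross-entropy loss
    # decays as a power of model size.
    return (n0 / n_params) ** alpha

def skill_competence(n_params: float, k: int) -> float:
    # Toy model: per-skill failure rate taken proportional to excess loss
    # (a hypothetical mapping), and a k-tuple task succeeds only if all
    # k independent skills succeed.
    failure = min(1.0, excess_loss(n_params))
    return (1.0 - failure) ** k

for n in (1e8, 1e9, 1e10, 1e11, 1e12):
    row = ", ".join(f"k={k}: {skill_competence(n, k):.3f}" for k in (1, 4, 16))
    print(f"N={n:.0e}  {row}")

Running this, competence on 16-tuples of skills is negligible while per-skill competence is mediocre, then rises steeply once per-skill competence nears 1, a crude illustration of emergence under these assumptions.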
Syllabus
A Theory for Emergence of Complex Skills in Language Models
Taught by
Simons Institute