Overview
Watch a 52-minute lecture by Joshua Batson of Anthropic on how frontier language models generalize across domains. Explore how these models adapt to different natural languages, programming languages, and encoding schemes, and how they handle stylistic analogies. Examine the geometry of LLM representations through the lens of 'superposition' and review insights from sparse autoencoders. Learn about surprising qualitative features found in frontier models, and consider open questions about what model representations imply for computation, training, failure modes, and reliability. Finally, understand why, despite their impressive capabilities, these models remain brittle on reasoning tasks and sensitive to question formulation and variable naming.
Syllabus
Generalization in the representations and computations of frontier language models.
Taught by
Simons Institute