Overview
Explore whether language models can truly represent the world described in text in this thought-provoking talk by Jacob Andreas of MIT. Delve into recent research on how transformer language models encode interpretable and controllable representations of facts and situations. Discover evidence from probing experiments that language model representations carry rudimentary information about entity properties and dynamic states, and that these representations influence downstream language generation. Examine the limitations of even the largest language models, including their tendency to hallucinate facts and contradict input text. Learn about REMEDI, a "representation editing" model designed to correct semantic errors by intervening in language model activations. Consider recent experiments revealing the complexity of accessing and manipulating language models' "knowledge" through simple probes. Gain insight into the ongoing challenges of building transparent and controllable world models for language generation systems.
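The probing experiments mentioned above typically train a small classifier on a model's frozen hidden states to test whether a property is linearly decodable. Below is a minimal sketch of that idea, assuming GPT-2 via the Hugging Face transformers library and a scikit-learn logistic-regression probe; the toy texts, labels, and layer choice are illustrative and not the exact setup from the talk.

```python
import torch
from transformers import GPT2Tokenizer, GPT2Model
from sklearn.linear_model import LogisticRegression

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2Model.from_pretrained("gpt2")
model.eval()

def entity_representation(text: str, layer: int = 6) -> torch.Tensor:
    """Return the hidden state of the final token at the given layer."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs, output_hidden_states=True)
    # hidden_states is a tuple of (num_layers + 1) tensors,
    # each of shape (batch, seq_len, hidden_dim)
    return outputs.hidden_states[layer][0, -1]

# Toy dynamic-state data: does the described container end up open or closed?
texts = [
    "I opened the box",       # open
    "I unlocked the chest",   # open
    "I sealed the jar",       # closed
    "I shut the door",        # closed
]
labels = [1, 1, 0, 0]

# Fit a linear probe on the frozen representations.
features = torch.stack([entity_representation(t) for t in texts]).numpy()
probe = LogisticRegression(max_iter=1000).fit(features, labels)
print("train accuracy:", probe.score(features, labels))
```

With real data, the probe would be evaluated on held-out examples; above-chance accuracy there is the kind of evidence the talk cites for state information being present in the representations.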
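REMEDI itself learns an editor that maps an attribute description to a vector added to an entity's hidden representation mid-forward-pass. The sketch below shows only the intervention mechanics, assuming GPT-2 and a PyTorch forward hook; the edit vector here is random noise standing in for REMEDI's learned edit, and the layer and token position are illustrative.

```python
import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

EDIT_LAYER = 6  # illustrative choice of intermediate layer

def make_edit_hook(edit_vector: torch.Tensor, position: int):
    """Add an edit vector to the hidden state at one token position."""
    def hook(module, inputs, output):
        hidden = output[0]  # (batch, seq_len, hidden_dim)
        # Only the initial prompt pass has the full sequence; cached
        # decoding steps (seq_len == 1) are left untouched.
        if hidden.shape[1] > position:
            hidden[:, position] += edit_vector
        return (hidden,) + output[1:]
    return hook

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt")

# Hypothetical edit vector: REMEDI would learn this from attribute text;
# random noise here just demonstrates the intervention plumbing.
edit = torch.randn(model.config.hidden_size) * 0.1

handle = model.transformer.h[EDIT_LAYER].register_forward_hook(
    make_edit_hook(edit, position=1)  # edit the " capital" token
)
try:
    out = model.generate(
        **inputs,
        max_new_tokens=10,
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id,
    )
finally:
    handle.remove()
print(tokenizer.decode(out[0]))
```

The design point is that the edit happens in activation space during generation, not in the model weights or the input text, which is what lets a method like REMEDI correct a semantic error without retraining.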
Syllabus
Language Models as World Models?
Taught by
Simons Institute