ROME - Locating and Editing Factual Associations in GPT - Paper Explained & Author Interview

Overview

This video lecture, which combines a paper walkthrough with an author interview, examines how large language models store and recall factual associations. It covers the mechanisms that let GPT models hold large amounts of world knowledge and presents ROME, a proposed method for targeted editing of individual facts. Topics include how causal tracing reveals where information is stored within the model, the role MLPs play in that storage, and how to edit a language model's knowledge with precision. The lecture also reviews the experimental evaluation, including the CounterFact benchmark, and discusses what the results imply for understanding a model's inner workings and for gaining greater control over AI systems.

Syllabus

- Introduction
- What are the main questions in this subfield?
- How causal tracing reveals where facts are stored
- Clever experiments show the importance of MLPs
- How do MLPs store information?
- How to edit language model knowledge with precision?
- What does it mean to know something?
- Experimental Evaluation & the CounterFact benchmark
- How to obtain the required latent representations?
- Where is the best location in the model to perform edits?
- What do these models understand about language?
- Questions for the community