Syllabus
- Introduction
- What are the main questions in this subfield?
- How causal tracing reveals where facts are stored
- Clever experiments show the importance of MLPs
- How do MLPs store information?
- How to edit language model knowledge with precision?
- What does it mean to know something?
- Experimental Evaluation & the CounterFact benchmark
- How to obtain the required latent representations?
- Where is the best location in the model to perform edits?
- What do these models understand about language?
- Questions for the community
Taught by
Yannic Kilcher