Overview
Explore model interpretation for neural networks in natural language processing through this lecture from CMU's CS 11-747 course. Delve into why interpretability matters, compare two broad directions in the field, and examine case studies such as probing for source syntax in neural machine translation and fine-grained analysis of sentence embeddings. Learn how to evaluate interpretations, including automatic evaluation, and survey explanation techniques such as influence functions, gradient-based importance scores, and extractive rationale generation. Gain insights into future directions in the field and enhance your understanding of how to analyze and interpret complex neural network models for NLP tasks.
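To make one of the techniques above concrete, here is a minimal sketch of gradient-based importance scores (input-times-gradient saliency), assuming PyTorch is available; the toy model, vocabulary size, and token ids are illustrative stand-ins, not material from the lecture.

```python
# A minimal sketch of gradient-based importance scores: backpropagate a
# class score to the token embeddings and score each token by the dot
# product of its embedding with its gradient ("input x gradient").
import torch
import torch.nn as nn

torch.manual_seed(0)

vocab_size, embed_dim, num_classes = 100, 16, 2
embedding = nn.Embedding(vocab_size, embed_dim)
classifier = nn.Linear(embed_dim, num_classes)

token_ids = torch.tensor([[5, 42, 7, 99]])  # one toy sentence of 4 token ids

# Embed the tokens and keep the (non-leaf) embeddings' gradient around.
embedded = embedding(token_ids)             # shape: (1, 4, embed_dim)
embedded.retain_grad()

# Mean-pool the token embeddings and classify.
logits = classifier(embedded.mean(dim=1))   # shape: (1, num_classes)
pred = logits.argmax(dim=-1).item()

# Backpropagate the predicted class's score to the embeddings.
logits[0, pred].backward()

# One importance score per input token.
saliency = (embedded * embedded.grad).sum(dim=-1).squeeze(0)
print(saliency)
```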
Syllabus
Intro
Why interpretability?
Dictionary definition
Two broad themes
Comparing two directions
Source Syntax in NMT
Why Neural Translations Are the Right Length
Fine-grained analysis of sentence embeddings (a minimal probing sketch follows this syllabus)
What you can cram into a single vector: Probing sentence embeddings for linguistic properties
How to evaluate?
Automatic evaluation
Explanation Technique: Influence Functions
Explanation Technique: Gradient-Based Importance Scores
Explanation Technique: Extractive Rationale Generation
Future Directions
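As a companion to the sentence-embedding probing topics in the syllabus above, here is a minimal probing sketch, assuming scikit-learn is installed; the random "embeddings" and the fabricated binary property labels are illustrative stand-ins for a real encoder's outputs and a real linguistic annotation.

```python
# A minimal probing sketch in the spirit of "What you can cram into a
# single vector": train a simple classifier on frozen sentence
# embeddings to test whether a linguistic property is decodable.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-in for fixed sentence embeddings from a trained encoder.
num_sentences, embed_dim = 1000, 64
embeddings = rng.normal(size=(num_sentences, embed_dim))

# Stand-in probing labels (e.g., binned sentence length); fabricated to
# correlate with one embedding dimension so the probe has a signal.
labels = (embeddings[:, 0] > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    embeddings, labels, test_size=0.2, random_state=0
)

# The probe itself: a linear classifier on the frozen embeddings.
# High held-out accuracy suggests the property is linearly decodable.
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(f"probe accuracy: {probe.score(X_test, y_test):.2f}")
```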
Taught by
Graham Neubig