Overview
Explore a tutorial lecture on falsifiable interpretability research in machine learning for computer vision. Delve into key concepts including saliency, input invariants, model parameter randomization, and whether silencing units actually helps human understanding. Examine case studies on individual neurons, activation maximization, and selective units, including the effects of ablating them. Learn about the obstacles to rigorous interpretability research, techniques for regularizing selectivity in generative models, and the importance of building strong, testable hypotheses in this field.
Syllabus
Introduction
Outline
Obstacles
Misdirection of Saliency
What is Saliency
Saliency axioms
Input invariants
Model parameter randomization
Does silencing help humans
Takeaways
Case Study 2
Individual neurons
Activation maximization
Populations
Selective units
Ablating selective units
Post-hoc studies
Regularizing selectivity
In generative models
Summary
Building better hypotheses
Building a stronger hypothesis
Key takeaways
Taught by
Bolei Zhou