Interpretability - Now What?

Overview

This 48-minute talk by Been Kim of Google Brain examines interpretability in machine learning. It opens with the goals and non-goals of interpretability, then covers benchmarking interpretability methods (BIM), three metrics for measuring false positives, and the Model Contrast Score (MCS). The second half introduces Concept Activation Vectors (CAVs) and TCAV (Testing with Concept Activation Vectors): the core idea of taking a derivative along a CAV to measure prediction sensitivity, quantitative validation that guards against spurious CAVs, and applications to two widely used image prediction models and to diabetic retinopathy diagnosis. The talk also reports a human subject experiment asking whether saliency maps can communicate the same information, and closes with the limitations of TCAV and broader considerations for interpretable machine learning.

Syllabus

Intro
My goal: interpretability
NON-goals
Investigating
Sanity check question
Benchmarking interpretability methods (BIM)
Three metrics for measuring false positives
Model Contrast Score (MCS)
Defining concept activation vector (CAV): Inputs
TCAV core idea: Derivative with CAV to get prediction sensitivity
Quantitative validation: Guarding against spurious CAV
Recap TCAV: Testing with Concept Activation Vectors
Sanity check experiment setup
Human subject experiment: Can saliency maps communicate the same information?
TCAV in two widely used image prediction models
Collect human doctor's knowledge
TCAV for Diabetic Retinopathy
Summary: Testing with Concept Activation Vectors
Responses from inside of academia
Limitations of TCAV
Things to keep in mind during our journey
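
For readers who want a concrete picture of the "TCAV core idea" item above, the following is a minimal sketch, not the speaker's implementation. It assumes that a CAV is taken as the unit normal of a linear classifier separating layer activations of concept examples from activations of random examples, and that the TCAV score is the fraction of class inputs whose class-logit gradient has a positive directional derivative along that CAV. The function names `fit_cav` and `tcav_score`, the plain gradient-descent logistic regression, and all of the data are hypothetical and synthetic; in a real setting the activations and gradients would come from a trained network.

```python
import numpy as np

def fit_cav(concept_acts, random_acts, steps=500, lr=0.1):
    """Fit a logistic-regression separator and return its unit normal (the CAV).

    Assumption: plain gradient descent stands in for whatever linear
    classifier one would actually use.
    """
    X = np.vstack([concept_acts, random_acts])
    y = np.concatenate([np.ones(len(concept_acts)), np.zeros(len(random_acts))])
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted P(concept)
        w -= lr * (X.T @ (p - y)) / len(y)       # gradient of mean log-loss
        b -= lr * np.mean(p - y)
    return w / np.linalg.norm(w)

def tcav_score(logit_grads, cav):
    """Fraction of inputs whose directional derivative along the CAV is positive."""
    sensitivities = logit_grads @ cav            # one directional derivative per input
    return float(np.mean(sensitivities > 0))

# Synthetic 3-dimensional "layer activations" (hypothetical data).
rng = np.random.default_rng(0)
concept_acts = rng.normal(loc=[3.0, 0.0, 0.0], scale=0.5, size=(100, 3))
random_acts = rng.normal(loc=[0.0, 0.0, 0.0], scale=0.5, size=(100, 3))
cav = fit_cav(concept_acts, random_acts)

# Synthetic gradients of a class logit with respect to the same layer.
grads = rng.normal(loc=[1.0, 0.0, 0.0], scale=0.1, size=(50, 3))
print("CAV:", cav)
print("TCAV score:", tcav_score(grads, cav))
```

In this toy setup the concept is shifted along the first axis, so the fitted CAV points mostly along that axis, and gradients that also point along it yield a high score. The "guarding against spurious CAV" item suggests repeating the procedure with multiple random example sets rather than trusting a single CAV.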

Taught by

Simons Institute
