Interpretable Explanations of Black Boxes by Meaningful Perturbation - CAP6412 Spring 2021
University of Central Florida via YouTube
Overview
Syllabus
Intro
Content
Abstract - Summary of image saliency methods: attention maps are limited by heuristic properties and architectural constraints
Introduction - Current problems: interpreting a black-box predictor; intuitive visualization methods are only heuristic, and their meaning remains unclear
Contribution - Develops principles and methods to explain any black-box function by determining attributes of its input-output mapping, regardless of the internal mechanisms used to implement them
Related Work - Gradient-based methods: backpropagate the gradient for a class label to the image layer; other methods include DeConvNet and Guided Backprop
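As a concrete illustration of the gradient-based approach above, here is a minimal PyTorch sketch; the ResNet-50 model and the random input tensor are stand-ins for illustration, not part of the lecture:

```python
import torch
from torchvision import models

# Assumed setup: any differentiable classifier works; ResNet-50 is a stand-in.
model = models.resnet50(weights=None).eval()

# A preprocessed input image would normally go here; a random tensor stands in.
image = torch.rand(1, 3, 224, 224, requires_grad=True)

# Backpropagate the predicted class score to the image layer.
scores = model(image)
class_idx = scores.argmax(dim=1).item()
scores[0, class_idx].backward()

# The saliency map is the maximum absolute gradient over the color channels.
saliency = image.grad.abs().max(dim=1)[0].squeeze()  # shape (224, 224)
```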
Related Work - CAM
Related Work - Comparison
Comparison with other saliency methods
Principle - A black box is a mapping function
Explanations as Meta-predictors - Rules are used to explain a robin classifier
Advantages of Explanations as Meta-predictors - The faithfulness of an explanation can be measured as its prediction accuracy, and explanations can be found automatically
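As a worked form of this idea (a hedged paraphrase of the paper's framing, not its exact notation): an explanation is a rule Q that predicts the black box's response, and its faithfulness can be scored by its expected prediction error over images (lower is more faithful):

```latex
% Example rule for a robin classifier f:
%   Q(x; f): f(x) = +1  \iff  x is a robin.
% Faithfulness of Q, scored as expected prediction error over images x:
\mathcal{L}(Q) = \mathbb{E}_{x}\!\left[\, 1 - \delta_{\,Q(x;\,f)} \,\right]
```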
Local Explanations
Saliency - Deleting parts of an image x serves as the perturbation of the whole image
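For concreteness, the local perturbations the paper studies can be written with a mask m : Λ → [0,1]; the convention below, where m(u) = 1 preserves pixel u, is an assumption and may differ in sign from the slides:

```latex
% Mask m : \Lambda \to [0,1]; m(u) = 1 preserves pixel u, m(u) = 0 deletes it.
[\Phi(x_0; m)](u) =
\begin{cases}
  m(u)\, x_0(u) + (1 - m(u))\, \mu_0 & \text{constant (average color } \mu_0) \\
  m(u)\, x_0(u) + (1 - m(u))\, \eta(u) & \text{noise (i.i.d.\ Gaussian } \eta) \\
  \int g_{\sigma_0 (1 - m(u))}(v - u)\, x_0(v)\, dv & \text{blur (max.\ std } \sigma_0)
\end{cases}
```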
A Meaningful Image Perturbation
Deletion and Preservation
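In the same notation, the two games can be sketched as follows (a paraphrase, with λ trading off mask size against the class score f_c):

```latex
% Deletion: find the smallest deleted region that most reduces the class score.
m^{*} = \operatorname*{argmin}_{m \in [0,1]^{\Lambda}}
        \lambda \|\mathbf{1} - m\|_{1} + f_c\!\left(\Phi(x_0; m)\right)
% Preservation: find the smallest preserved region that most retains the score.
m^{*} = \operatorname*{argmin}_{m \in [0,1]^{\Lambda}}
        \lambda \|m\|_{1} - f_c\!\left(\Phi(x_0; m)\right)
```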
Artifact Reduction
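Tying the last three slides together, here is a minimal PyTorch sketch of the deletion game with two artifact-reduction tools (a coarse upsampled mask with a total-variation penalty, plus input jitter); the average-pool blur, grid size, and hyperparameters are assumptions, not the authors' exact settings:

```python
import torch
import torch.nn.functional as F

def tv_norm(m):
    """Total-variation penalty: discourages fragmented, artifact-prone masks."""
    return ((m[..., 1:, :] - m[..., :-1, :]).abs().mean()
            + (m[..., :, 1:] - m[..., :, :-1]).abs().mean())

def explain(model, image, class_idx, steps=300, lam=0.05, beta=0.2):
    """Deletion game: learn a mask m (1 = preserve, 0 = delete) for one image."""
    mask = torch.ones(1, 1, 28, 28, requires_grad=True)  # coarse grid vs. artifacts
    opt = torch.optim.Adam([mask], lr=0.1)
    for _ in range(steps):
        # Jitter: randomly shift the input each step (artifact reduction).
        dy, dx = torch.randint(-2, 3, (2,)).tolist()
        x0 = torch.roll(image, shifts=(dy, dx), dims=(-2, -1))
        # Blur perturbation: a heavy average-pool blur stands in for Gaussian blur.
        blurred = F.avg_pool2d(x0, kernel_size=11, stride=1, padding=5)
        m = F.interpolate(mask, size=x0.shape[-2:], mode='bilinear',
                          align_corners=False)
        x = m * x0 + (1 - m) * blurred  # m = 1 keeps pixels, m = 0 deletes them
        score = torch.softmax(model(x), dim=1)[0, class_idx]
        # Objective: small deleted area + smooth mask + low class score.
        loss = lam * (1 - mask).abs().mean() + beta * tv_norm(mask) + score
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            mask.clamp_(0, 1)  # keep mask values in [0, 1]
    return mask.detach()
```

Optimizing a low-resolution mask and randomly shifting the input both make it harder for the optimizer to exploit adversarial artifacts, which is what the Artifact Reduction slide addresses.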
Experiment - Interpretability
Experiment - Testing Hypotheses: Animal Part Saliency
Experiment - Adversarial Defense
Experiment - Localization and Pointing
Conclusion
Questions?
Taught by
UCF CRCV