Uncovering and Inducing Interpretable Causal Structure in Deep Learning Models

Valence Labs via YouTube

Overview

Explore a lecture on uncovering and inducing interpretable causal structure in deep learning models. The talk presents the theory of causal abstraction as a foundation for faithful, interpretable explanations of AI model behavior. It covers two approaches: analysis mode, which intervenes on a model's internal states to uncover causal structure that is already there, and training mode, which applies interventions during training to induce interpretable causal structure. Case studies apply both techniques to deep learning models that process language and images. Key concepts include causal abstraction, interchange interventions, and distributed alignment search.
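To make the idea of an interchange intervention concrete, here is a minimal sketch in plain Python. It is not taken from the lecture: the toy "network", the function names, and the high-level model (`S = a + b`, `out = S + c`) are all assumptions chosen for illustration. The sketch runs the network on a base input, overwrites one hidden unit with the value it takes on a source input, and checks whether the output matches what the high-level model predicts under the same swap.

```python
# Toy illustration of an interchange intervention.
# Everything here (hidden, readout, the high-level model) is hypothetical.

def hidden(x):
    """Low-level hidden layer: h[0] = a + b, h[1] = c."""
    a, b, c = x
    return [a + b, c]

def readout(h):
    """Low-level output layer: sum of the hidden units."""
    return h[0] + h[1]

def run(x):
    """Ordinary forward pass with no intervention."""
    return readout(hidden(x))

def interchange_intervention(base, source, unit):
    """Run on `base`, but overwrite hidden `unit` with its value on `source`."""
    h = hidden(base)
    h[unit] = hidden(source)[unit]
    return readout(h)

def high_level_counterfactual(base, source):
    """High-level model: S = a + b, out = S + c, with S taken from `source`."""
    s = source[0] + source[1]
    return s + base[2]

base, source = (1, 2, 3), (4, 5, 6)
patched = interchange_intervention(base, source, unit=0)
expected = high_level_counterfactual(base, source)
print(patched, expected, patched == expected)
```

If the two values agree across many base and source pairs, that is evidence that hidden unit 0 plays the role of the high-level variable `S`. Real networks rarely store a variable in a single unit, which is the motivation for methods that search over distributed representations instead.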

Syllabus

- Discussant Slide
- Introduction
- Causal Abstraction
- Interchange Interventions
- Distributed Alignment Search

Taught by

Valence Labs
