Overview
It's hard to opt out
Confidence scores
Explanations in plain English: free-text / chain-of-thought
Input attribution: gradient-based & select-then-predict
Feature interactions: effective attention
Concept-based explanations: TCAV
Data influence: influence functions
Contrastive explanations: contrastive editing
Explainability as a dialogue
Taxonomy of evaluation of explanations
Simulatability
Why are application-grounded evals of explanations scarce in NLP?
Application-grounded evaluations
Trust in AI
Taught by
UofU Data Science