Fairness without Imputation: A Decision Tree Approach for Fair Prediction with Missing Values
Harvard CMSA via YouTube
Overview
Watch a research presentation from the 2022 Symposium on Foundations of Responsible Computing where Harvard University researcher Haewon Jeong explores fair machine learning approaches for handling missing data. Learn about the challenges of applying fairness interventions when missingness patterns correlate with sensitive group attributes such as gender or race. Discover a novel decision tree-based method that avoids explicit data imputation while optimizing for fairness through a regularized objective function. Follow along as the speaker analyzes different sources of discrimination risk in imputed datasets, presents theoretical foundations, and demonstrates experimental results showing how this integrated approach outperforms existing fairness intervention methods on real-world datasets with missing values. Gain insights into biased imputation, mismatched imputation methods, and future research directions for developing fair machine learning systems that handle incomplete data responsibly.
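To make the "regularized objective" idea concrete, here is a minimal illustrative sketch, not the speaker's Fair MIP Forest implementation: a training objective that adds a group-fairness penalty (here, the demographic-parity gap, one common group fairness metric) to ordinary classification error, weighted by a hypothetical trade-off parameter `lam`.

```python
# Illustrative sketch only -- not the talk's Fair MIP Forest algorithm.
# Objective: classification error + lam * fairness penalty, where the
# penalty is the demographic-parity gap between two groups (0 and 1).

def demographic_parity_gap(predictions, groups):
    """Absolute difference in positive-prediction rates between groups 0 and 1."""
    rates = {}
    for g in (0, 1):
        preds_g = [p for p, grp in zip(predictions, groups) if grp == g]
        rates[g] = sum(preds_g) / len(preds_g)
    return abs(rates[0] - rates[1])

def fair_regularized_objective(predictions, labels, groups, lam=1.0):
    """0-1 loss plus a demographic-parity penalty scaled by lam."""
    error = sum(p != y for p, y in zip(predictions, labels)) / len(labels)
    return error + lam * demographic_parity_gap(predictions, groups)

# Example: a classifier that predicts positive only for group 0
# pays a large fairness penalty on top of its error.
preds  = [1, 1, 1, 0, 0, 0]
labels = [1, 1, 0, 1, 0, 0]
groups = [0, 0, 0, 1, 1, 1]
print(fair_regularized_objective(preds, labels, groups, lam=0.5))  # 2/6 + 0.5*1.0
```

Minimizing such an objective over decision trees, rather than first imputing and then applying a fairness intervention, is the integrated approach the overview describes.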
Syllabus
Intro
Data Missing Patterns Depend on Sensitive Group Attributes
Group Fairness Metrics
Fair Machine Learning Literature
Fair Learning With Missing Values
Main Contributions
Biased imputation method
Mismatched imputation methods
Imputation without being aware of the downstream task
Numerical Results of Fair MIP Forest Algorithm
Future Directions
Taught by
Harvard CMSA