Watch a 54-minute lecture by MIT researcher Jonathan Shafer at the Simons Institute on defending against undetectable backdoors in machine learning systems. The talk builds on the result of Goldwasser et al. that an adversary can plant computationally undetectable backdoors in ML models and use them to covertly control model behavior. Because such backdoors cannot be detected efficiently, the defenses presented aim to remove or mitigate them rather than detect them, drawing on ideas from program self-correction and random self-reducibility. Two mitigation techniques are covered: a global approach for binary classification, which assumes the ground-truth labels are close to a decision tree or a Fourier-sparse function, and computationally efficient local methods for regression, which assume the ground-truth function is linear or polynomial. These are black-box techniques that require no access to the model's code or parameters, and the lecture also touches on robust mean estimation. The lecture presents joint work with Shafi Goldwasser, Neekon Vafa, and Vinod Vaikuntanathan, motivated by the security concerns that arise as society's reliance on machine learning grows.
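To make the idea of random self-reducibility concrete, here is a minimal Python sketch for the simplest case, a linear ground-truth function. It is an illustration under stated assumptions, not the algorithm presented in the lecture: the names `backdoored_model` and `self_correct`, the Gaussian sampling of `r`, and the median aggregation are all hypothetical choices. The sketch relies on the identity f(x) = f(x + r) - f(r) for linear f, so the model is queried only at randomized points and never at x itself, and it needs only black-box query access.

```python
import numpy as np

def backdoored_model(x, w, trigger, tol=1e-9):
    """Hypothetical model: linear in x, except on one planted trigger input."""
    if np.linalg.norm(x - trigger) < tol:
        return 1e6  # attacker-chosen output on the trigger
    return float(w @ x)

def self_correct(model, x, rng, trials=15, scale=10.0):
    """Estimate the linear ground truth at x without querying the model at x.

    Assumes the ground truth is linear, so f(x) = f(x + r) - f(r).
    The median tolerates a minority of corrupted estimates.
    """
    estimates = []
    for _ in range(trials):
        r = rng.normal(scale=scale, size=x.shape)  # random shift
        estimates.append(model(x + r) - model(r))
    return float(np.median(estimates))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = np.array([2.0, -1.0, 0.5])        # hypothetical true weights
    trigger = np.array([1.0, 1.0, 1.0])   # hypothetical backdoor input
    model = lambda x: backdoored_model(x, w, trigger)
    print("raw output on trigger:      ", model(trigger))
    print("self-corrected on trigger:  ", self_correct(model, trigger, rng))
```

On the trigger input the raw model returns the attacker's value, while the self-corrected estimate stays close to the linear value `w @ trigger`, because each query lands on a randomized point that almost surely misses the trigger.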