Syllabus
Intro
Conventional Paradigm: Supervised Learning
Key Challenge of Supervised Learning
Road Map
Background on Self-supervised Learning
Data Augmentation
Pre-training an Encoder - SimCLR [ICML'20]
Building a Downstream Classifier
Backdoor Attack
Key Idea of Our Attack
Quantifying Effectiveness Goal
Quantifying Condition I
Quantifying Utility Goal
Optimization Problem
Attack Setting
Attack Success Rate
Clean Accuracy and Backdoored Accuracy
Evaluation on Real-world Pre-trained Encoders
Existing Defenses Are Insufficient
Summary
Motivation for Data Auditing
Auditing Unauthorized Data Use
Examples of Real-world Unauthorized Data Use
Our EncoderMI: Membership Inference-based Data Auditing for Pre-trained Encoders
Revisiting Encoder Pre-training
Shadow Training Setup
Pre-training a Shadow Encoder
Constructing a Training Set for Inference Classifier
Building an Inference Classifier
Experimental Setup
Evaluation on CLIP
Conclusion
Taught by
Google TechTalks