Overview
Explore differentially private sampling from distributions in this 45-minute Google TechTalk presented by Marika Swanberg. The talk investigates private sampling, an abstraction of the goal of generating small amounts of realistic-looking data while maintaining privacy. Learn about tight upper and lower bounds on the dataset size needed to sample from several distribution families, including arbitrary distributions on finite sets and product distributions on binary vectors. Discover how private sampling compares with non-private learning in the number of observations required: in some parameter regimes it needs fewer observations, while in others it is as hard as private learning. Examine the relationship between private sampling and private learning, including distribution classes where the extra observations needed for private learning over non-private learning are accounted for by the requirements of private sampling. Gain insights into research presented at NeurIPS, with topics spanning differential privacy, cryptography, and their intersection with legal questions. Follow along as the speaker discusses the properties of differential privacy, sampling accuracy, related work, and the techniques used in differentially private sampling.
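As a rough illustration of what a private sampler does, the sketch below outputs a single bit whose distribution tracks the empirical frequency of a binary dataset while satisfying pure ε-differential privacy when one record is replaced. It is not taken from the talk: the smoothing rule and the names `smoothing_constant`, `output_probability`, and `private_bernoulli_sample` are assumptions made for illustration.

```python
import math
import random


def smoothing_constant(epsilon):
    """Pseudo-count c chosen so that (1 + c) / c == exp(epsilon)."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return 1.0 / math.expm1(epsilon)


def output_probability(bits, epsilon):
    """Probability that the sampler outputs 1 on the dataset `bits`."""
    c = smoothing_constant(epsilon)
    ones = sum(bits)
    # Adding c pseudo-counts to each outcome keeps both output
    # probabilities away from 0, which bounds the ratio between
    # neighboring datasets by exp(epsilon).
    return (ones + c) / (len(bits) + 2 * c)


def private_bernoulli_sample(bits, epsilon, rng=random):
    """Draw one bit with the smoothed empirical bias (illustrative only)."""
    return 1 if rng.random() < output_probability(bits, epsilon) else 0


if __name__ == "__main__":
    data = [1] * 30 + [0] * 70  # hypothetical dataset, empirical bias 0.3
    print(output_probability(data, epsilon=1.0))
    print(private_bernoulli_sample(data, epsilon=1.0))
```

The pseudo-count c = 1/(e^ε − 1) shrinks as ε grows, so the output bias approaches the empirical frequency; for small ε the output is pulled toward a fair coin.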
Syllabus
Intro
Private Data Analysis
Properties of Differential Privacy
Private Sampling
Sampling Accuracy
Context
Summary of Contributions
Related Work
Sample Complexity of DP Sampling
Techniques
Simple Example: Bernoulli with Bounded Bias
Frequency-Count-Based Sampler
Overview: k-ary lower bound
Proof of Key Lemma
Putting it all together: k-ary LB
Removing the assumptions
Taught by
Google TechTalks