Does Learning Require Memorization? A Short Tale About a Long Tail - Vitaly Feldman, Google Brain
Alan Turing Institute via YouTube
Overview
Explore the necessity of memorization in learning algorithms through a conference talk that examines why deep neural networks memorize their training data and the privacy risks this memorization creates. Discover a theoretical model explaining why, on natural long-tailed data distributions, memorizing labels is necessary to achieve close-to-optimal generalization error. Learn about experimental results on standard benchmarks that demonstrate the essential role of memorization in deep learning. Gain insights into the cross-fertilization between statistics and computer science in the era of Big Data, and understand how this intersection has shaped modern machine learning.
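The long-tail intuition behind the talk can be sketched numerically. The snippet below is a hypothetical illustration (the subpopulation count, Zipf-style prior, and sample size are arbitrary choices, not taken from the talk): when subpopulation frequencies follow a long-tailed distribution, a sizeable fraction of training examples are the sole representative of their subpopulation, and these "singleton" examples are the ones the talk argues a near-optimal learner must memorize.

```python
import random

random.seed(0)

# Hypothetical setup: K subpopulations whose frequencies follow a
# Zipf-like (long-tailed) prior. We draw a training set and count how
# many examples belong to a subpopulation seen exactly once.
K = 10_000                                     # number of subpopulations
weights = [1.0 / (k + 1) for k in range(K)]    # long-tailed frequency prior
n = 10_000                                     # training-set size

sample = random.choices(range(K), weights=weights, k=n)

counts = {}
for s in sample:
    counts[s] = counts.get(s, 0) + 1

# Singletons: examples that are the only one drawn from their subpopulation.
singletons = sum(1 for c in counts.values() if c == 1)
frac = singletons / n
print(f"{frac:.0%} of examples are the sole representative of their subpopulation")
```

With these (arbitrary) parameters a substantial share of the sample consists of singletons, which is the regime in which fitting (memorizing) rare examples measurably reduces generalization error.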
Syllabus
Intro
Privacy concerns
Current state of the art
Related work
Toy Story
Subpopulations
Long tail distribution
Model description
Prior over frequencies
Memorization
Coupling
Experimental Approach
Results
Evidence for memorization
Conclusion
Taught by
Alan Turing Institute