Overview
Explore the groundbreaking theory of large-scale learning with Deep Neural Networks in this Stanford University seminar. Delve into the correspondence between Deep Learning and the Information Bottleneck framework as presented by Dr. Naftali Tishby, professor of Computer Science at the Hebrew University of Jerusalem. Discover a new generalization bound, the input-compression bound, and why compressing the input representation is essential for good generalization. Learn how the mutual information between the last hidden layer and the input and output variables characterizes the sample complexity and accuracy of large-scale Deep Neural Networks. Understand how Stochastic Gradient Descent achieves these optimal bounds, yielding insight into the benefits of hidden layers and into design principles for deep networks. Gain a comprehensive understanding of the interface between computer science, statistical physics, and computational neuroscience, including applications of statistical physics and information theory in computational learning theory.
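As background for the topics above, the central object of the Information Bottleneck framework is a trade-off between compressing the input X into a representation T (such as a hidden layer) and preserving information about the label Y. The following is a minimal sketch in our own notation, not taken from the seminar slides:

\[
  \min_{p(t \mid x)} \; I(X;T) \;-\; \beta\, I(T;Y)
\]

Here I(X;T) measures how much T compresses the input, I(T;Y) measures how much relevant label information T retains, and the multiplier \(\beta\) trades the two off. The input-compression bound mentioned in the overview is, roughly, a standard generalization bound in which the hypothesis-class cardinality is replaced by the approximately \(2^{I(T;X)}\) typical input patterns that T distinguishes:

\[
  \epsilon^2 \;\lesssim\; \frac{2^{\,I(T;X)} + \log(1/\delta)}{2m}
\]

where m is the number of training samples, \(\delta\) the confidence parameter, and \(\epsilon\) the generalization gap; compressing the input (lowering I(T;X)) therefore reduces the number of samples needed to generalize.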
Syllabus
Introduction
Neural Networks
Information Theory
Neural Network
Mutual Information
Information Paths
Questions
Typical Patterns
Cardinality
Finite Samples
Optimal Compression
Taught by
Stanford Online