On the Statistical Complexity of Reinforcement Learning
Institute for Pure & Applied Mathematics (IPAM) via YouTube
Overview
Syllabus
Intro
Tabular Markov decision process
Prior efforts: algorithms and sample complexity results
Minimax optimal sample complexity of tabular MDP
Adding some structure: state feature map
Representing value function using linear combination of features
Rethinking Bellman equation
Reducing Bellman equation using features
Sample complexity of RL with features
Off-Policy Policy Evaluation (OPE)
OPE with function approximation
Equivalence to plug-in estimation
Minimax-optimal batch policy evaluation
Lower Bound Analysis
Episodic Reinforcement Learning
Feature space embedding of transition kernel
Regret Analysis
Exploration with Value-Targeted Regression (VTR)
Taught by
Institute for Pure & Applied Mathematics (IPAM)