Reinforcement Learning

Simons Institute via YouTube

Overview

Explore reinforcement learning in this 31-minute lecture by Martin Wainwright from UC Berkeley, presented at the Foundations of Data Science Institute Kickoff Workshop. Gain a bird's-eye view of RL and its application in personal health. Delve into exploiting structure in RL, including Q-learning with low rank structure and comparing model-free versus model-based methods. Examine the performance of LSTD versus model-based methods in linear state space models with quadratic reward functions. Investigate exploration/exploitation beyond bandits, including Q-learning with UCB and Monte Carlo Tree Search. Consider the concept of instance-optimality in RL, particularly in policy evaluation. Finally, explore RL in offline settings and its connections to causal inference, touching on instrumental variables, propensity scores, doubly robust methods, and synthetic controls.

Syllabus

Intro
Bird's-eye view of RL
Illustrative application: RL in personal health
General thrust
Direction: Exploiting structure in RL
Vignette: Q-learning with low rank structure
Vignette: Model-free versus model-based methods
Estimate dynamics or value functions for LQR? - Linear state space model with quadratic reward function
Performance of LSTD versus model-based methods
Direction: Exploration/exploitation beyond bandits
Vignette: Q-learning with UCB
Vignette: UCB and Monte Carlo Tree Search
Direction: From worst-case to instance-optimality
Vignette: Instance-optimality of TD learning?
Instance-optimality in policy evaluation
Direction: RL in offline settings and causal inference
Some future directions exploiting methods from causal inference: instrumental variables, propensity scores, doubly robust methods, synthetic controls

Taught by

Simons Institute
