Reinforcement Learning

Overview

Explore reinforcement learning in this 31-minute lecture by Martin Wainwright from UC Berkeley, presented at the Foundations of Data Science Institute Kickoff Workshop. Gain a bird's-eye view of RL and its application in personal health. Delve into exploiting structure in RL, including Q-learning with low rank structure and comparing model-free versus model-based methods. Examine the performance of LSTD versus model-based methods in linear state space models with quadratic reward functions. Investigate exploration/exploitation beyond bandits, including Q-learning with UCB and Monte Carlo Tree Search. Consider the concept of instance-optimality in RL, particularly in policy evaluation. Finally, explore RL in offline settings and its connections to causal inference, touching on instrumental variables, propensity scores, doubly robust methods, and synthetic controls.

Syllabus

Intro
Bird's-eye view of RL
Illustrative application: RL in personal health
General thrust
Direction: Exploiting structure in RL
Vignette: Q-learning with low rank structure
Vignette: Model-free versus model-based methods
Estimate dynamics or value functions for LQR? - Linear state space model with quadratic reward function
Performance of LSTD versus model-based methods
Direction: Exploration/exploitation beyond bandits
Vignette: Q-learning with UCB
Vignette: UCB and Monte Carlo Tree Search
Direction: From worst-case to instance-optimality
Vignette: Instance-optimality of TD learning?
Instance-optimality in policy evaluation
Direction: RL in offline settings and causal inference
Some future directions exploiting methods from causal inference: instrumental variables, propensity scores, doubly robust methods, synthetic controls