Class Central Classrooms beta
YouTube videos curated by Class Central.
Classroom Contents
ADSI Summer Workshop: Algorithmic Foundations of Learning and Control - Emma Brunskill
- 1 Intro
- 2 Legacy of Reinforcement Learning to Benefit People
- 3 Techniques to Minimize & Understand Data Needed to Learn to Make Good Decisions
- 4 Challenge: Covariate Shift - Different Policies - Different Actions - Different State Distributions
- 5 Quest: Batch Policy Optimization w/ Generalization Bounds
- 6 Recall: Importance Sampling for RL Batch Policy Evaluation
- 7 1st Proof of Convergence to a Local Optima for Batch Policy Gradient
- 8 Experiment Settings
- 9 HIV treatment simulator
- 10 Aim: Strong Generalization Guarantees on Policy Performance, Alternative: Guarantee Find Best in Class Policy
- 11 Example: Linear Thresholding Policies
- 12 An Advantage Decomposition
- 13 Advantage Doubly Robust (ADR) Estimator
- 14 Quest for Batch Policy Optimization with Generalization Guarantees