ADSI Summer Workshop: Algorithmic Foundations of Learning and Control - Emma Brunskill
Paul G. Allen School via YouTube
Overview
Syllabus
Intro
Legacy of Reinforcement Learning to Benefit People
Techniques to Minimize & Understand Data Needed to Learn to Make Good Decisions
Challenge: Covariate Shift: Different Policies → Different Actions → Different State Distributions
Quest: Batch Policy Optimization with Generalization Bounds
Recall: Importance Sampling for RL Batch Policy Evaluation
First Proof of Convergence to a Local Optimum for Batch Policy Gradient
Experiment Settings
HIV treatment simulator
Aim: Strong Generalization Guarantees on Policy Performance; Alternative: Guarantee Finding the Best-in-Class Policy
Example: Linear Thresholding Policies
An Advantage Decomposition
Advantage Doubly Robust (ADR) Estimator
Quest for Batch Policy Optimization with Generalization Guarantees
Taught by
Paul G. Allen School