Stochastic Variance Reduction Methods for Policy Evaluation

Overview

Explore stochastic variance reduction methods for policy evaluation in reinforcement learning in this 47-minute lecture by Lihong Li of Microsoft Research. The talk explains why the mean squared projected Bellman error (MSPBE) is hard to optimize directly and how it can be reformulated as a saddle-point problem. It then covers gradient-based approaches to that problem: primal-dual batch gradient, stochastic gradient descent, and the variance-reduced methods Stochastic Variance Reduced Gradient (SVRG) and SAGA. The lecture closes with a complexity summary, preliminary experiments on random MDPs and the Mountain Car benchmark, and a comparison with previous work.
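To make the saddle-point idea concrete, here is a minimal sketch of an SVRG-style primal-dual update. It is an illustration under stated assumptions, not the lecture's own code: it assumes linear value-function approximation and the commonly used objective L(theta, w) = -w^T A theta - (1/2 w^T C w - b^T w), where A, b and C are sample averages built from feature vectors and rewards. The function name, argument names, and default step size are hypothetical.

```python
import numpy as np

def svrg_policy_evaluation(phi, phi_next, rewards, gamma,
                           step=0.01, epochs=50, inner_steps=None, seed=0):
    """SVRG-style descent in theta / ascent in w on the assumed objective
    L(theta, w) = -w^T A theta - (0.5 * w^T C w - b^T w), with
    A = mean(phi_t (phi_t - gamma * phi'_t)^T), b = mean(r_t phi_t),
    C = mean(phi_t phi_t^T). Returns the primal iterate theta."""
    rng = np.random.default_rng(seed)
    n, d = phi.shape
    inner_steps = 2 * n if inner_steps is None else inner_steps
    diff = phi - gamma * phi_next  # rows are phi_t - gamma * phi'_t

    def batch_grads(theta, w):
        # Gradients of L averaged over all n samples.
        phi_w = phi @ w
        g_theta = -(diff.T @ phi_w) / n                        # -A^T w
        g_w = phi.T @ (rewards - diff @ theta - phi_w) / n     # b - A theta - C w
        return g_theta, g_w

    def sample_grads(t, theta, w):
        # Same gradients computed from the single sample t.
        phi_w = phi[t] @ w
        g_theta = -diff[t] * phi_w
        g_w = phi[t] * (rewards[t] - diff[t] @ theta - phi_w)
        return g_theta, g_w

    theta, w = np.zeros(d), np.zeros(d)
    for _ in range(epochs):
        # Snapshot the iterates and compute one full gradient per epoch.
        theta_snap, w_snap = theta.copy(), w.copy()
        full_theta, full_w = batch_grads(theta_snap, w_snap)
        for _ in range(inner_steps):
            t = rng.integers(n)
            g_theta, g_w = sample_grads(t, theta, w)
            s_theta, s_w = sample_grads(t, theta_snap, w_snap)
            # Variance-reduced estimate: sample gradient, corrected by the
            # same sample's gradient at the snapshot plus the full gradient.
            theta = theta - step * (g_theta - s_theta + full_theta)
            w = w + step * (g_w - s_w + full_w)
    return theta
```

The correction term keeps each stochastic step unbiased while shrinking its variance as the iterates approach the snapshot, which is what allows a constant step size. When A is invertible, the iterates should approach the solution of A theta = b, provided the step size is small enough for the data at hand.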

Syllabus

Intro
Reinforcement Learning
Policy Evaluation (PE)
Main Results
Notation
Objective Function for PE
Outline
Challenge with MSPBE
MSPBE as Saddle-Point Problem
Primal Dual Batch Gradient for L(θ,w)
Stochastic Gradient Descent for L(θ,w)
Stochastic Variance Reduced Gradient (SVRG)
SAGA
Extensions
Complexity: Summary
Preliminary Experiments
Experiments: Benchmarks
Random MDPs
Mountain Car
Previous Work
Conclusions

Taught by

Simons Institute
