Trust Region Policy Optimization

Pascal Poupart via YouTube

Classroom Contents

  1. Intro
  2. Reinforcement Learning
  3. Problems of Policy Gradient
  4. RL to Optimization
  5. What loss to optimize?
  6. New State Visitation is Difficult
  7. Minorization Maximization (MM) algorithm
  8. Solving KL-Penalized Problem
  9. Conjugate Gradient (CG)
  10. TRPO: KL-Constrained
  11. TRPO Algorithm