Introduction to Reinforcement Learning

Open Data Science via YouTube

Exploration vs Exploitation

28 of 46

Classroom Contents

  1. Intro
  2. Part One: Reinforcement Learning (RL)
  3. Applications: Board Games
  4. Applications: 2D Video Games
  5. Applications: Simulated 3D Robotics
  6. Applications: Robotics
  7. Applications: "World Models"
  8. Applications: Language grounding
  9. Applications: Multi-agent collaboration
  10. The Formulation
  11. Agent-Environment Loop in code
  12. Core Concepts: State(s)
  13. Core Concepts: Complex State(s)
  14. Core Concepts: Reward(s)
  15. Core Concepts: Return and Discount → The return G_t is the total discounted reward from time-step t
  16. Core Concepts: Value Function(s)
  17. Core Concepts: Policies
  18. Core Concepts: Markov Assumption
  19. Core Concepts: Markov Decision Process
  20. Model-based: Dynamic Programming
  21. Model-based Reinforcement Learning
  22. Bellman equation
  23. Policy evaluation example
  24. Generalized Policy Iteration
  25. GridWorlds: Sokoban
  26. The rest of the iceberg
  27. Continuous action/state spaces
  28. Exploration vs Exploitation
  29. Credit Assignment
  30. Sparse, noisy and delayed rewards
  31. Reward hacking
  32. Model-free: Reinforcement Learning
  33. Monte Carlo evaluation
  34. Temporal difference evaluation
  35. Q-learning: Tabular setting
  36. OpenAI Gym
  37. DeepMind Lab
  38. Part Two: Deep Reinforcement Learning
  39. Value function approximation
  40. Policy Gradients: Baseline and Advantage
  41. Policy Gradients: Actor-Critic for Starcraft 2
  42. Policy Gradients: PPO for DotA
  43. Policy Gradients: PPO for robotics
  44. Policy Gradients: Sonic Retro Contest
  45. Big picture view of the main algorithms
  46. More RL applications
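
The "Return and Discount" entry in the contents describes the return as the total discounted reward from a time-step. A minimal sketch of that definition follows, assuming the common convention G_t = R_{t+1} + gamma * R_{t+2} + gamma^2 * R_{t+3} + ... with a discount factor gamma between 0 and 1. The function name `discounted_return`, the parameter name `gamma`, and the sample rewards are illustrative assumptions and are not taken from the lecture.

```python
def discounted_return(rewards, gamma=0.99):
    """Total discounted reward for a finite list of rewards.

    `rewards` holds the rewards received after the chosen time-step, in order.
    `gamma` is an assumed name for the discount factor.
    """
    g = 0.0
    # Work backwards so each step applies G_t = R_{t+1} + gamma * G_{t+1}.
    for reward in reversed(rewards):
        g = reward + gamma * g
    return g


# Hypothetical rewards: 1.0 + 0.5 * 0.0 + 0.25 * 2.0 = 1.5
print(discounted_return([1.0, 0.0, 2.0], gamma=0.5))
```

With gamma = 1.0 the function reduces to a plain sum of rewards, and smaller values of gamma weight near-term rewards more heavily.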
