Core Concepts: Markov Assumption
YouTube videos curated by Class Central.
Classroom Contents
Introduction to Reinforcement Learning
- 1 Intro
- 2 Part One: Reinforcement Learning (RL)
- 3 Applications: Board Games
- 4 Applications: 2D Video Games
- 5 Applications: Simulated 3D Robotics
- 6 Applications: Robotics
- 7 Applications: "World Models"
- 8 Applications: Language grounding
- 9 Applications: Multi-agent collaboration
- 10 The Formulation
- 11 Agent-Environment Loop in code
- 12 Core Concepts: State(s)
- 13 Core Concepts: Complex State(s)
- 14 Core Concepts: Reward(s)
- 15 Core Concepts: Return and Discount → The Return G_t is the total discounted reward from time-step t
- 16 Core Concepts: Value Function(s)
- 17 Core Concepts: Policies
- 18 Core Concepts: Markov Assumption
- 19 Core Concepts: Markov Decision Process
- 20 Model-based: Dynamic Programming
- 21 Model-based Reinforcement Learning
- 22 Bellman equation
- 23 Policy evaluation example
- 24 Generalized Policy Iteration
- 25 GridWorlds: Sokoban
- 26 The rest of the iceberg
- 27 Continuous action/state spaces
- 28 Exploration vs Exploitation
- 29 Credit Assignment
- 30 Sparse, noisy and delayed rewards
- 31 Reward hacking
- 32 Model-free: Reinforcement Learning
- 33 Monte Carlo evaluation
- 34 Temporal difference evaluation
- 35 Q-learning: Tabular setting
- 36 OpenAI gym
- 37 DeepMind Lab
- 38 Part Two: Deep Reinforcement Learning
- 39 Value function approximation
- 40 Policy Gradients: Baseline and Advantage
- 41 Policy Gradients: Actor-Critic for Starcraft 2
- 42 Policy Gradients: PPO for DotA
- 43 Policy Gradients: PPO for robotics
- 44 Policy Gradients: Sonic Retro Contest
- 45 Big picture view of the main algorithms
- 46 More RL applications