Classroom Contents
Population-Based Methods for Single- and Multi-Agent Reinforcement Learning - Lecture
- 1 Welcome to the AI Seminar Series
- 2 Reinforcement Learning (RL)
- 3 RL basics
- 4 Deep Q-learning (DQN)
- 5 Why use target network?
- 6 Why reduce estimation variance?
- 7 Ensemble RL methods
- 8 Ensemble RL for variance reduction
- 9 MeanQ design choices
- 10 Combining with existing techniques
- 11 Experiment results (100K interaction steps)
- 12 Obviating the target network
- 13 Comparing model size and update rate
- 14 MeanQ: variance reduction
- 15 Loss of ensemble diversity
- 16 Linear function approximation
- 17 Diversity through independent sampling
- 18 Ongoing investigation
- 19 Takeaways
- 20 Fictitious Play
- 21 What to do in large dynamical environments
- 22 PSRO convergence properties
- 23 Extensive-Form Double Oracle (XDO)
- 24 XDO: results
- 25 XDO convergence properties