

Population-Based Methods for Single- and Multi-Agent Reinforcement Learning - Lecture

USC Information Sciences Institute via YouTube

Overview

In this 51-minute lecture at the USC Information Sciences Institute, Roy Fox of UCI surveys population-based methods for single- and multi-agent reinforcement learning. The first part covers ensemble methods for reinforcement learning, focusing on the MeanQ algorithm, which reduces estimation variance without a stabilizing target network. It also discusses the curious theoretical equivalence of MeanQ to a non-ensemble method, along with MeanQ's superior performance. The second part turns to double-oracle methods in adversarial environments and introduces the XDO algorithm, which exploits sequential game structure to reduce the worst-case population size. Topics include RL basics, Deep Q-learning, ensemble RL methods, variance reduction techniques, and extensive-form double oracle algorithms. The speaker's research interests span reinforcement learning, algorithmic game theory, information theory, and robotics.
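
As background for the variance-reduction theme, the following minimal Python sketch illustrates the general statistical idea that averaging several independent noisy estimates lowers variance. It is not the MeanQ algorithm from the lecture; the function name `ensemble_mean_variance`, the "true" value of 5.0, and the Gaussian noise model are all hypothetical choices made only for illustration.

```python
import numpy as np

def ensemble_mean_variance(num_members, num_trials=20000, noise_std=1.0, seed=0):
    """Empirical variance of the mean of `num_members` independent noisy estimates.

    Assumption: each ensemble member's estimate is the true value plus
    independent zero-mean Gaussian noise. Real Q-function ensembles are
    generally not independent, so this is an idealized illustration.
    """
    rng = np.random.default_rng(seed)
    true_value = 5.0  # hypothetical true value being estimated
    noise = noise_std * rng.standard_normal((num_trials, num_members))
    estimates = true_value + noise
    # Average across ensemble members, then measure variance across trials.
    return float(estimates.mean(axis=1).var())

single = ensemble_mean_variance(num_members=1)
ensemble = ensemble_mean_variance(num_members=5)
print(f"variance with 1 member:  {single:.3f}")
print(f"variance with 5 members: {ensemble:.3f}")
```

Under these idealized independence assumptions, the variance of the mean of K estimates is roughly 1/K of a single estimate's variance, so the five-member figure should come out near one fifth of the single-member figure.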

Syllabus

Welcome to the AI Seminar Series
Reinforcement Learning (RL)
RL basics
Deep Q-learning (DQN)
Why use a target network?
Why reduce estimation variance?
Ensemble RL methods
Ensemble RL for variance reduction
MeanQ design choices
Combining with existing techniques
Experiment results (100K interaction steps)
Obviating the target network
Comparing model size and update rate
MeanQ: variance reduction
Loss of ensemble diversity
Linear function approximation
Diversity through independent sampling
Ongoing investigation
Takeaways
Fictitious Play
What to do in large dynamical environments
PSRO convergence properties
Extensive-Form Double Oracle (XDO)
XDO: results
XDO convergence properties

Taught by

USC Information Sciences Institute
