Overview
Explore a groundbreaking approach to self-supervised reinforcement learning in this 35-minute video explanation. Dive into the Plan2Explore model, which enables agents to explore their environment efficiently without predefined rewards. Learn how this method plans in the latent space of a learned world model ("in imagination") to seek out states whose outcomes are uncertain, improving upon traditional intrinsic reward formulations. Discover the key components of the model, including intrinsic motivation, planning in latent space, and latent disagreement. Understand how Plan2Explore approximates maximizing information gain and where the model still runs into problems. Examine the experimental results and final comments from the presenter, Yannic Kilcher. Gain a comprehensive understanding of this technique, which improves sample efficiency and enables quick adaptation to multiple downstream tasks in zero-shot or few-shot settings.
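To make the latent-disagreement idea concrete, here is a minimal sketch of the intrinsic reward it rests on: an ensemble of one-step models predicts the next latent state, and the disagreement (variance) among their predictions is used as the exploration reward. This is an illustrative assumption of how such a reward could look, not code from the video or the paper; the names (latent_disagreement_reward, make_model, the linear toy models) are hypothetical stand-ins for learned neural predictors.

```python
import numpy as np

def latent_disagreement_reward(latent, action, ensemble):
    """Intrinsic reward: variance across an ensemble of one-step predictors
    of the next latent state, averaged over latent dimensions."""
    predictions = np.stack([model(latent, action) for model in ensemble])  # (K, latent_dim)
    return predictions.var(axis=0).mean()

# Toy ensemble: K linear "one-step models" with different random weights,
# standing in for learned neural predictors (hypothetical example).
latent_dim, action_dim, K = 8, 2, 5

def make_model(seed):
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(latent_dim, latent_dim + action_dim))
    return lambda z, a: W @ np.concatenate([z, a])

ensemble = [make_model(s) for s in range(K)]
rng = np.random.default_rng(0)
z = rng.normal(size=latent_dim)
a = rng.normal(size=action_dim)
print("intrinsic reward:", latent_disagreement_reward(z, a, ensemble))
```

In this sketch, states where the ensemble members predict very different next latents receive a high reward, so a planner maximizing this signal is driven toward uncertain, informative parts of the environment.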
Syllabus
- Intro & Problem Statement
- Model
- Intrinsic Motivation
- Planning in Latent Space
- Latent Disagreement
- Maximizing Information Gain
- More Problems with the Model
- Experiments
- Final Comments
Taught by
Yannic Kilcher