Formal Languages and Automata for Reward Function Specification and Efficient Reinforcement Learning

Simons Institute via YouTube

Overview

In this lecture, Sheila McIlraith of the University of Toronto shows how formal languages and automata can be used to specify reward functions and to make reinforcement learning (RL) more efficient. The talk opens with the challenges of real-world RL, in particular how to express goals and preferences, and presents Linear Temporal Logic (LTL) as a compelling logic for expressing temporal properties of traces. It then introduces reward machines as a way to define reward functions and compares several learning methods: a Q-Learning baseline, Option-Based Hierarchical RL (HRL), and Q-Learning for Reward Machines (QRM). Experimental results are presented for the discrete Office World and Minecraft World domains and for Water World. The final part covers ways of creating reward machines, including constructing them from formal languages, generating them with a symbolic planner, and learning them for partially observable RL.
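
To make the idea of a reward machine concrete, here is a minimal sketch in Python. It is not taken from the lecture: the class name RewardMachine, the state names (u0, u1, u_done), the event names (coffee, office), and the task itself are all hypothetical choices for illustration. The sketch assumes a reward machine is a finite-state machine that reads the high-level events observed at each step and emits a reward on each transition.

```python
from dataclasses import dataclass, field


@dataclass
class RewardMachine:
    """A finite-state machine that emits a reward on each transition."""
    initial_state: str
    terminal_states: frozenset
    # Maps (state, event) -> (next_state, reward). All names are hypothetical.
    transitions: dict = field(default_factory=dict)

    def step(self, state, events):
        """Advance on the set of events observed during one environment step."""
        for event in sorted(events):
            if (state, event) in self.transitions:
                return self.transitions[(state, event)]
        # No matching transition: stay in the same state with zero reward.
        return state, 0.0


# Hypothetical task: pick up coffee first, then reach the office.
rm = RewardMachine(
    initial_state="u0",
    terminal_states=frozenset({"u_done"}),
    transitions={
        ("u0", "coffee"): ("u1", 0.0),
        ("u1", "office"): ("u_done", 1.0),
    },
)


def total_reward(machine, trace):
    """Run the machine over a trace (a list of event sets) and sum the rewards."""
    state, total = machine.initial_state, 0.0
    for events in trace:
        if state in machine.terminal_states:
            break
        state, reward = machine.step(state, events)
        total += reward
    return state, total


# Coffee before office completes the task; office alone does not.
print(total_reward(rm, [set(), {"coffee"}, set(), {"office"}]))
print(total_reward(rm, [{"office"}, set()]))
```

Because the reward depends on the machine's state as well as the current events, the same event (reaching the office) is rewarded only after coffee has been collected. This is the kind of temporally extended structure that a reward function defined over single environment states cannot express directly.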

Syllabus

Intro
Acknowledgements
Reinforcement Learning (RL)
Challenges of Real-World RL
Goals and Preferences
Linear Temporal Logic (LTL): A compelling logic to express temporal properties of traces
Challenges to RL
Toy Problem Disclaimer
Running Example
Decoupling Transition and Reward Functions
The Rest of the Talk
Define a Reward Function using a Reward Machine
Reward Function Vocabulary
Simple Reward Machine
Reward Machines in Action
Other Reward Machines
Q-Learning Baseline
Option-Based Hierarchical RL (HRL)
HRL with RM-Based Pruning (HRL-RM)
HRL Methods Can Find Suboptimal Policies
Q-Learning for Reward Machines (QRM)
QRM In Action
Recall: Methods for Exploiting RM Structure
QRM + Reward Shaping (QRM + RS)
Test Domains
Test in Discrete Domains
Office World Experiments
Minecraft World Experiments
Function Approximation with QRM
Water World Experiments
Creating Reward Machines
Reward Specification: one size does not fit all
Construct Reward Machine from Formal Languages
Generate RM using a Symbolic Planner
Learn RMs for Partially-Observable RL

Taught by

Simons Institute
