Challenges in Reward Design for Reinforcement Learning-based Traffic Signal Control - An Investigation Using CO2 Emission Objective

Eclipse Foundation via YouTube

Overview

This 21-minute conference talk examines why reward design is difficult for Deep Reinforcement Learning (DRL) agents that control traffic signals, using the objective of minimizing CO2 emissions at a signalized intersection. The agents are trained in the SUMO (Simulation of Urban MObility) simulator, and the talk compares how different reward metrics and combinations of metrics affect agent performance. It explains why emission-based rewards are inefficient for training Deep Q-Networks (DQN), shows that agent performance is sensitive to reward parameterization, and discusses why some reward formulations perform inconsistently across scenarios. The central finding is that two reward properties, informativeness and expressiveness, are necessary for reinforcement learning-based traffic signal control. Presenters Christian Medeiros Adriano and Max Schumacher also walk through the Cliff Walking example, discuss aligning Traffic Signal Control (TSC) agents with rewards, and name the technologies that helped their research most.
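To make the idea of reward combinations and parameter sensitivity concrete, here is a minimal Python sketch. It is not the presenters' formulation: the `StepMetrics` fields, the weights, the units, and all numbers are hypothetical and chosen only for illustration. It shows how changing one weight in a combined reward can change which of two candidate signal actions looks better to an agent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StepMetrics:
    """Hypothetical per-step measurements at one intersection."""
    co2_mg: float          # CO2 emitted during the step (assumed unit: mg)
    waiting_time_s: float  # summed waiting time of halted vehicles (seconds)


def combined_reward(m: StepMetrics, w_emission: float = 1.0,
                    w_waiting: float = 0.0) -> float:
    """Negative weighted sum of penalties; higher (less negative) is better.

    Emissions are divided by 1000 (mg -> g) so both terms have a similar
    magnitude. The weights are illustrative assumptions.
    """
    return -(w_emission * m.co2_mg / 1000.0 + w_waiting * m.waiting_time_s)


def preferred_action(outcomes: dict, **weights) -> str:
    """Return the action name whose outcome yields the highest reward."""
    return max(outcomes, key=lambda a: combined_reward(outcomes[a], **weights))


# Made-up outcomes of two candidate actions from the same state.
outcomes = {
    "keep_phase": StepMetrics(co2_mg=5000.0, waiting_time_s=40.0),
    "switch_phase": StepMetrics(co2_mg=5200.0, waiting_time_s=10.0),
}

# Emission-only reward: the two actions differ by only 0.2 reward units.
emission_only = preferred_action(outcomes, w_emission=1.0, w_waiting=0.0)

# Adding a small waiting-time weight flips the preferred action.
with_waiting = preferred_action(outcomes, w_emission=1.0, w_waiting=0.05)

print(emission_only, with_waiting)
```

With these made-up numbers, the emission-only reward prefers `keep_phase` and the combined reward prefers `switch_phase`, so a small change in one weight is enough to change the ranking of actions. In an actual SUMO setup the metrics would come from the simulator rather than from hard-coded values.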

Syllabus

Intro
The Importance of Aligning Powerful AI Systems
Reinforcement Learning Example: Cliff Walking
Aligning TSC Agents with Rewards
Objective: Minimizing CO2 Emission at a Signalized Intersection
Reinforcement Learning Setup
Training the Neural Network - Deep Q-Network (DQN)
Motivation - Uninformative Emission Penalty
Informativeness and Expressiveness for Alignment
Findings Comparing Rewards
Findings - Rewards are sensitive to parameterization
Conclusion - Informativeness and Expressiveness are necessary
Technologies that helped a LOT

Taught by

Eclipse Foundation
