Challenges in Reward Design for Reinforcement Learning-based Traffic Signal Control - An Investigation Using CO2 Emission Objective

Eclipse Foundation via YouTube

Classroom Contents

  1. Intro
  2. The Importance of Aligning Powerful AI Systems
  3. Reinforcement Learning Example: Cliff Walking
  4. Aligning TSC Agents with Rewards
  5. Objective: Minimizing CO2 Emission at a Signalized Intersection
  6. Reinforcement Learning Setup
  7. Training the Neural Network - Deep Q-Network (DQN)
  8. Motivation - Uninformative Emission Penalty
  9. Informativeness and Expressiveness for Alignment
  10. Findings Comparing Rewards
  11. Findings - Rewards are sensitive to parameterization
  12. Conclusion - Informativeness and Expressiveness are necessary
  13. Technologies that helped a LOT
