Classroom Contents
Challenges in Reward Design for Reinforcement Learning-based Traffic Signal Control - An Investigation Using CO2 Emission Objective
- 1 Intro
- 2 The Importance of Aligning Powerful AI Systems
- 3 Reinforcement Learning Example: Cliff Walking
- 4 Aligning TSC Agents with Rewards
- 5 Objective: Minimizing CO2 Emission at a Signalized Intersection
- 6 Reinforcement Learning Setup
- 7 Training the Neural Network - Deep Q-Network (DQN)
- 8 Motivation - Uninformative Emission Penalty
- 9 Informativeness and Expressiveness for Alignment
- 10 Findings Comparing Rewards
- 11 Findings - Rewards are sensitive to parameterization
- 12 Conclusion - Informativeness and Expressiveness are necessary
- 13 Technologies that helped a LOT