Overview
Explore a seminar on advanced optimization techniques for solving large-scale Markov decision processes (MDPs). Delve into the research of Joan Bas Serrano, a PhD student focusing on theoretical reinforcement learning at Universitat Pompeu Fabra. Examine the linear programming formulation of MDPs and the saddle-point optimization theory that applies to average-reward MDPs. Discover a novel approach that computes optimal policies from a linearly relaxed version of the saddle-point problem. Analyze the conditions under which this approach converges to the optimal policy, and learn about an optimization algorithm designed for fast convergence rates that do not depend on the size of the state space. Gain insight into potential issues with previous work in this area and the implications for future research on reinforcement learning algorithms.
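For orientation, the linear program and saddle-point problem mentioned above can be written in a standard textbook form for average-reward MDPs. This is a sketch under assumed notation, not necessarily the notation used in the seminar: x and a denote states and actions, P the transition kernel, r the reward function, \mu a stationary state-action distribution, \Delta the probability simplex over state-action pairs, and v a vector of dual variables with one entry per state.

```latex
% Assumed notation (not taken from the seminar): \mu, v, P, r, \Delta as described above.
\begin{align*}
\text{(LP)} \quad
  & \max_{\mu \ge 0} \; \sum_{x,a} \mu(x,a)\, r(x,a) \\
  & \text{s.t.} \quad \sum_{a} \mu(x',a) = \sum_{x,a} P(x' \mid x,a)\, \mu(x,a)
    \quad \text{for all } x', \qquad \sum_{x,a} \mu(x,a) = 1 \\[6pt]
\text{(saddle point)} \quad
  & \max_{\mu \in \Delta} \; \min_{v} \; \sum_{x,a} \mu(x,a)
    \Big[ r(x,a) + \sum_{x'} P(x' \mid x,a)\, v(x') - v(x) \Big]
\end{align*}
```

The saddle-point form is the Lagrangian of the LP: the inner minimization over v is unbounded below unless \mu satisfies the stationarity constraint, so both problems have the same optimal value.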
Syllabus
Seminar Series: Faster saddle-point optimization for solving large-scale Markov decision processes
Taught by
VinAI