Finite Time Analysis of Temporal Difference Learning with Linear Function Approximation: Tail Averaging and Regularization
Centre for Networked Intelligence, IISc via YouTube
Overview
Watch a technical lecture on the finite-time analysis of temporal difference (TD) learning with linear function approximation, delivered by Prof. Prashanth L.A. of IIT Madras. The talk examines tail-averaged TD learning and shows how it achieves the optimal O(1/t) convergence rate, both in expectation and with high probability. It also presents a step-size selection scheme that requires no eigenvalue information about the projected TD fixed-point matrix, explains how tail averaging improves the decay rate of the initial error compared with full-iterate averaging, and introduces a regularized TD variant suited to ill-conditioned features. The lecture is based on research accepted at AISTATS 2023. The speaker's broader work spans reinforcement learning, simulation optimization, and multi-armed bandits, with applications in transportation systems, wireless networks, and recommendation systems.
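To make the ideas concrete, the sketch below implements tail-averaged TD(0) with linear function approximation on a small, hypothetical Markov reward process. The transition matrix, rewards, and features are illustrative inventions, not taken from the lecture, and the 1/sqrt(t) step size stands in for the paper's universal (eigenvalue-free) step-size choice; this is a minimal sketch of the general technique, not the authors' exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 3-state Markov reward process (illustrative only)
P = np.array([[0.1, 0.6, 0.3],
              [0.4, 0.2, 0.4],
              [0.3, 0.3, 0.4]])   # transition probabilities
r = np.array([1.0, 0.0, -1.0])   # expected reward in each state
gamma = 0.9                      # discount factor
Phi = np.array([[1.0, 0.0],      # linear features, one row per state
                [0.0, 1.0],
                [0.5, 0.5]])

d = Phi.shape[1]
theta = np.zeros(d)              # initial parameter vector
T = 20000
iterates = np.zeros((T, d))

s = 0
for t in range(T):
    s_next = rng.choice(3, p=P[s])
    # TD(0) update; the diminishing step size needs no knowledge of the
    # eigenvalues of the projected TD fixed-point matrix
    alpha = 1.0 / np.sqrt(t + 1)
    delta = r[s] + gamma * Phi[s_next] @ theta - Phi[s] @ theta
    theta = theta + alpha * delta * Phi[s]
    iterates[t] = theta
    s = s_next

# Tail average: average only the last half of the iterates, so the
# early transient (the "initial error") is discarded rather than
# diluted, as it would be under full-iterate averaging
theta_tail = iterates[T // 2:].mean(axis=0)
print(theta_tail)
```

Averaging only the tail of the iterate sequence is what lets the initial error decay faster than under full-iterate (Polyak-Ruppert) averaging, which keeps every early iterate in the average.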
Syllabus
Finite time analysis of temporal difference learning with linear function approximation
Taught by
Centre for Networked Intelligence, IISc