Overview
Watch a 51-minute DS4DM Coffee Talk on Hindsight Learning for Markov Decision Processes (MDPs) with Exogenous Inputs, presented by Sean Sinclair from MIT. The talk covers sequential decision-making under uncertainty, focusing on resource management problems in which exogenous variables outside the decision-maker's control affect outcomes. It formalizes these problems as Exo-MDPs and introduces a class of data-efficient algorithms called Hindsight Learning (HL). HL algorithms achieve their data efficiency by revisiting past decisions in hindsight to infer their counterfactual consequences, which accelerates policy improvement. The talk compares HL against classic baselines on multi-secretary and airline revenue management problems, then examines how the algorithms scale to a critical cloud resource management problem: allocating Virtual Machines (VMs) to physical machines, simulated with real datasets from a major public cloud provider. Across these applications, HL algorithms are reported to outperform domain-specific heuristics and state-of-the-art reinforcement learning methods.
Syllabus
Hindsight Learning for MDPs with Exogenous Inputs, Sean Sinclair
Taught by
GERAD Research Center