Explore the foundations and current perspectives of stochastic bandits in this lecture by Shipra Agrawal of Columbia University. The talk covers the fundamental model of sequential learning, in which the rewards from each action are assumed to be drawn independently and identically distributed (i.i.d.) from that action's fixed distribution. It presents the main algorithms for stochastic bandits, including Upper Confidence Bound (UCB) and Thompson Sampling, and shows how these algorithms can be adapted to incorporate various additional constraints. The talk is part of the Data-Driven Decision Processes Boot Camp at the Simons Institute.
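For readers who want a concrete picture of the two algorithms named above, here is a minimal Python sketch. It is not taken from the lecture. It assumes Bernoulli rewards, uses the textbook UCB1 index (empirical mean plus sqrt(2 ln t / n)), and uses Thompson Sampling with uniform Beta(1, 1) priors. The function names `ucb1` and `thompson_sampling` and the arm means are illustrative assumptions.

```python
import math
import random


def ucb1(means, horizon, rng):
    """Run UCB1 on a Bernoulli bandit (assumed reward model); return pull counts."""
    k = len(means)
    counts = [0] * k   # number of pulls per arm
    sums = [0.0] * k   # total reward per arm
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # pull each arm once so every index is defined
        else:
            # Choose the arm with the largest upper confidence bound.
            arm = max(
                range(k),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        reward = 1.0 if rng.random() < means[arm] else 0.0  # i.i.d. draw
        counts[arm] += 1
        sums[arm] += reward
    return counts


def thompson_sampling(means, horizon, rng):
    """Run Beta-Bernoulli Thompson Sampling; return pull counts."""
    k = len(means)
    successes = [0] * k
    failures = [0] * k
    for _ in range(horizon):
        # Sample a plausible mean for each arm from its Beta posterior.
        samples = [
            rng.betavariate(successes[i] + 1, failures[i] + 1) for i in range(k)
        ]
        arm = max(range(k), key=lambda i: samples[i])
        if rng.random() < means[arm]:  # i.i.d. Bernoulli reward
            successes[arm] += 1
        else:
            failures[arm] += 1
    return [successes[i] + failures[i] for i in range(k)]


if __name__ == "__main__":
    arm_means = [0.2, 0.5, 0.8]  # hypothetical arms, unknown to the learner
    print("UCB1 pulls:    ", ucb1(arm_means, 2000, random.Random(0)))
    print("Thompson pulls:", thompson_sampling(arm_means, 2000, random.Random(0)))
```

Both functions return how often each arm was pulled. Over a long enough horizon, most pulls should typically concentrate on the arm with the highest mean, which is the behavior that regret analyses of these algorithms quantify.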