Explore the theory and algorithms behind adversarial multi-armed bandit problems in this comprehensive lecture by Haipeng Luo from USC. Delve into the intersection of online learning and bandit literature, focusing on sequential decision-making without distributional assumptions and learning with partial information feedback. Begin with an overview of classical algorithms and their analysis before progressing to recent advances in data-dependent regret guarantees, structural bandits, bandits with switching costs, and combining bandit algorithms. Compare and contrast online learning with full-information feedback versus bandit feedback, gaining valuable insights into this influential field of study.
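The classical algorithm typically covered first in this setting is EXP3 (exponential weights with importance-weighted loss estimates). The summary does not name it explicitly, so the following is a hedged minimal sketch of EXP3 for adversarial bandits, illustrating the "partial information" point: only the played arm's loss is observed, and an unbiased estimate is fed to an exponential-weights update.

```python
import math
import random

def exp3(num_arms, horizon, loss_fn, eta=None):
    """Minimal EXP3 sketch (assumed algorithm, not taken from the lecture).

    loss_fn(t, arm) -> loss in [0, 1]; only the chosen arm's loss is revealed,
    which is the bandit-feedback restriction the lecture contrasts with
    full-information online learning.
    """
    if eta is None:
        # A standard tuning yielding O(sqrt(T * K * log K)) expected regret.
        eta = math.sqrt(2.0 * math.log(num_arms) / (horizon * num_arms))
    weights = [1.0] * num_arms
    total_loss = 0.0
    for t in range(horizon):
        total_w = sum(weights)
        probs = [w / total_w for w in weights]
        arm = random.choices(range(num_arms), weights=probs)[0]
        loss = loss_fn(t, arm)            # bandit feedback: one arm's loss only
        total_loss += loss
        estimate = loss / probs[arm]      # importance weighting keeps it unbiased
        weights[arm] *= math.exp(-eta * estimate)
    return total_loss
```

Under full-information feedback one would instead update every arm with its true loss each round (the Hedge algorithm), which improves the regret from O(sqrt(TK log K)) to O(sqrt(T log K)); the importance-weighted estimate above is exactly the price of seeing only partial information.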