Overview
Explore the fundamentals of Bayesian and contextual bandits in this comprehensive lecture. Delve into multi-armed bandits, Bayesian learning, and distributional information before working through a coin example that introduces Bernoulli variables and the Beta distribution. Compare approaches and learn how beliefs are updated before diving into the Thompson Sampling Algorithm for Bernoulli rewards. Conclude by investigating contextual bandits, approximate reward functions, and predictive posteriors, gaining a solid foundation in these important machine learning concepts.
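To make the Beta-Bernoulli setting concrete, here is a minimal Python sketch of Thompson sampling for Bernoulli rewards. It is an illustration of the standard algorithm the lecture covers, not the lecture's own code; the function name thompson_sampling and the simulated arm probabilities are assumptions for the example. Each arm keeps a Beta(alpha, beta) posterior, one success probability is sampled from each posterior, the arm with the largest sample is played, and the observed 0/1 reward updates the conjugate posterior.

import random

def thompson_sampling(arms, n_rounds, seed=0):
    """Beta-Bernoulli Thompson sampling.

    arms: true (unknown) Bernoulli success probabilities,
          used here only to simulate rewards (illustrative values).
    """
    rng = random.Random(seed)
    k = len(arms)
    alpha = [1.0] * k  # Beta prior parameter (pseudo-count of successes + 1)
    beta = [1.0] * k   # Beta prior parameter (pseudo-count of failures + 1)
    total_reward = 0
    for _ in range(n_rounds):
        # Sample a plausible success probability for each arm from its
        # current Beta posterior, then play the arm with the best sample.
        samples = [rng.betavariate(alpha[i], beta[i]) for i in range(k)]
        a = max(range(k), key=lambda i: samples[i])
        # Observe a Bernoulli reward and apply the conjugate belief update:
        # Beta(alpha, beta) -> Beta(alpha + r, beta + 1 - r).
        r = 1 if rng.random() < arms[a] else 0
        alpha[a] += r
        beta[a] += 1 - r
        total_reward += r
    return total_reward, alpha, beta

if __name__ == "__main__":
    reward, alpha, beta = thompson_sampling([0.3, 0.5, 0.7], n_rounds=1000)
    print("total reward:", reward)
    # Posterior mean per arm: alpha / (alpha + beta)
    print("posterior means:", [a / (a + b) for a, b in zip(alpha, beta)])

Note how exploration emerges for free in this scheme: arms with wide, uncertain posteriors occasionally produce large samples and get tried, while arms whose posteriors concentrate around a low mean are chosen less and less often.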
Syllabus
Intro
Multi-Armed Bandits
Bayesian Learning
Distributional Information
Coin Example
Bernoulli Variables
Beta Distribution
Comparison
Belief Update
Thompson Sampling Algorithm for Bernoulli Rewards
Contextual Bandits
Approximate Reward Function
Predictive Posterior
Taught by
Pascal Poupart