

Adaptive Multi-armed Bandit Algorithms for Markovian and IID Rewards

Centre for Networked Intelligence, IISc via YouTube

Overview

Explore a technical lecture on multi-armed bandit (MAB) algorithms that handle both Markovian and independent and identically distributed (i.i.d.) rewards. The lecture examines why regret guarantees are hard to obtain when arm rewards form Markov chains that fall outside single-parameter exponential families. It presents an algorithm that uses a total variation distance-based test to decide whether rewards are Markovian or i.i.d., then adapts dynamically between the standard and a specialized Kullback-Leibler upper confidence bound (KL-UCB) approach. The lecture is delivered by Prof. Arghyadip Roy of the Mehta Family School of Data Science and Artificial Intelligence at IIT Guwahati. His research covers stochastic systems optimization, wireless network resource allocation, and reinforcement learning, with experience at institutions including IIT Bombay, University of Illinois at Urbana-Champaign, and Jadavpur University.
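
As a rough illustration of the two ingredients named in the overview, here is a minimal Python sketch, assuming binary (0/1) rewards. It is not the algorithm from the lecture: the function names, the fixed threshold of 0.1, and the simple log(t) exploration budget are all assumptions made for this example, and the specialized KL-UCB variant for Markovian rewards is not shown because the overview does not describe it. The test compares the empirical next-reward distributions after a 0 and after a 1; for binary rewards their total variation distance reduces to the absolute difference of the two probabilities.

```python
import math


def bernoulli_kl(p, q):
    """KL divergence between Bernoulli(p) and Bernoulli(q), clamped away from 0 and 1."""
    eps = 1e-12
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def kl_ucb_index(mean, pulls, t, iters=50):
    """Standard Bernoulli KL-UCB index (assumed budget log(t)):
    the largest q >= mean with pulls * KL(mean, q) <= log(t), found by bisection."""
    budget = math.log(max(t, 2)) / pulls
    lo, hi = mean, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if bernoulli_kl(mean, mid) <= budget:
            lo = mid
        else:
            hi = mid
    return lo


def transition_rows(rewards):
    """Empirical P(next = 1 | current = s) for s in (0, 1); None if s was never left."""
    counts = [[0, 0], [0, 0]]
    for current, nxt in zip(rewards, rewards[1:]):
        counts[current][nxt] += 1
    rows = []
    for s in (0, 1):
        n = counts[s][0] + counts[s][1]
        rows.append(counts[s][1] / n if n else None)
    return rows


def looks_iid(rewards, threshold=0.1):
    """Hypothetical TV-distance test: treat rewards as i.i.d. when the two
    empirical transition rows are within `threshold` in total variation."""
    p01, p11 = transition_rows(rewards)
    if p01 is None or p11 is None:
        return True  # not enough data to reject the i.i.d. hypothesis
    return abs(p01 - p11) <= threshold
```

Under these assumptions, a strictly alternating reward sequence such as 0, 1, 0, 1 is flagged as Markovian, while a sequence whose next reward does not depend on the current one passes as i.i.d., at which point an adaptive scheme could fall back to the standard index computed by `kl_ucb_index`.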

Syllabus

Time: 5:00– PM

Taught by

Centre for Networked Intelligence, IISc

