Risk-Sensitive Bandits - Arm Mixtures Optimality and Regret-Efficient Algorithms

Overview

Explore a technical lecture on risk-sensitive multi-armed bandit problems presented by Dr. Arpan Mukherjee of Imperial College London. Delve into a new framework for risk-aware sequential decision-making that unifies a range of risk measures through distortion riskmetrics. Learn about the observation that the optimal strategy is often a mixture of arms rather than a single arm, which runs counter to the design of conventional bandit algorithms. Discover newly developed algorithms designed to track the optimal mixture when the risk measure favors one, and understand the technical challenges in establishing information-theoretic lower bounds on regret in mixture-optimal settings. Examine open questions and future research directions in risk-sensitive decision-making, guided by Dr. Mukherjee, whose expertise spans signal processing, statistics, and machine learning, and who studied at Rensselaer Polytechnic Institute and IIT Kharagpur.

Syllabus

Time: 5:00 PM - PM IST

Taught by

Centre for Networked Intelligence, IISc
