Class Central Classrooms beta
YouTube videos curated by Class Central.
Classroom Contents
Optimal Gradient-Based Algorithms for Non-Concave Bandit Optimization
- 1 Intro
- 2 Bandit Problem
- 3 Our focus: beyond linearity and concavity
- 4 Problem I: the Stochastic Bandit Eigenvector Problem
- 5 Some related work
- 6 Information theoretical understanding
- 7 Beyond cubic dimension dependence
- 8 Our method: noisy power method
- 9 Problem II: Stochastic Low-rank linear reward
- 10 Our algorithm: noisy subspace iteration
- 11 Regret comparisons: quadratic reward
- 12 Higher-order problems
- 13 Problem III: Symmetric High-order Polynomial bandit
- 14 Problem IV: Asymmetric High-order Polynomial bandit
- 15 Lower bound: Optimal dependence on d
- 16 Overall Regret Comparisons
- 17 Extension to RL in simulator setting
- 18 Conclusions: We find optimal regret for different types of reward functions
- 19 Future directions