Fast Rates for Bandit Optimization with Upper-Confidence Frank-Wolfe

02/22/2017
by Quentin Berthet, et al.

We consider the problem of bandit optimization, inspired by stochastic optimization and online learning problems with bandit feedback. In this problem, the objective is to minimize a global loss function of all the actions, not necessarily a cumulative loss. This framework allows us to study a very general class of problems, with applications in statistics, machine learning, and other fields. To solve this problem, we analyze the Upper-Confidence Frank-Wolfe algorithm, inspired by techniques for bandits and convex optimization. We give theoretical guarantees for the performance of this algorithm over various classes of functions, and discuss the optimality of these results.
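The abstract describes the Upper-Confidence Frank-Wolfe algorithm only at a high level: take Frank-Wolfe steps over the simplex of action proportions, but replace the true gradient with an optimistic estimate built from confidence bounds. Below is a minimal sketch of that idea, assuming the loss L is smooth and its gradient can be computed from per-arm mean estimates (as in the multi-armed bandit special case, where the rule reduces to UCB). The names (ucb_frank_wolfe, grad_L, sample) and the sqrt(2 log t / n_i) confidence width are illustrative choices, not the paper's exact specification.

```python
import numpy as np

def ucb_frank_wolfe(grad_L, sample, K, T):
    """Sketch of an Upper-Confidence Frank-Wolfe loop over K arms.

    grad_L(p, mu_hat) -> gradient of the loss L at proportions p,
        computed from per-arm mean estimates mu_hat (assumption: the
        unknown enters L only through the arm means).
    sample(i) -> one noisy observation for arm i (bandit feedback).
    """
    counts = np.zeros(K)   # number of times each arm was played
    mu_hat = np.zeros(K)   # empirical mean observation per arm
    p = np.zeros(K)        # empirical proportions of plays

    for t in range(T):
        if t < K:
            i = t          # play each arm once to initialize estimates
        else:
            # Optimistic gradient: subtract a confidence bonus, then take
            # a Frank-Wolfe step, i.e. move toward the minimizing vertex
            # of the simplex, which is a single arm.
            bonus = np.sqrt(2.0 * np.log(t + 1) / counts)
            i = int(np.argmin(grad_L(p, mu_hat) - bonus))
        x = sample(i)
        counts[i] += 1
        mu_hat[i] += (x - mu_hat[i]) / counts[i]
        # Step size 1/(t+1) keeps p equal to the empirical distribution
        # of the arms played so far.
        e = np.zeros(K)
        e[i] = 1.0
        p = p + (e - p) / (t + 1)
    return p

# Example: classic multi-armed bandit, L(p) = -<p, mu>, so the gradient
# coordinate for arm i is -mu_i and the selection rule recovers UCB.
rng = np.random.default_rng(0)
true_mu = np.array([0.2, 0.5, 0.7])
p_final = ucb_frank_wolfe(
    grad_L=lambda p, mu_hat: -mu_hat,
    sample=lambda i: rng.normal(true_mu[i], 1.0),
    K=3, T=5000,
)
print(p_final)  # proportions should concentrate on the best arm
```

In this sketch the global loss is evaluated at the empirical distribution of plays rather than summed per round, which is what distinguishes bandit optimization from cumulative-regret bandits.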


Related research

- 05/16/2019 · Data Poisoning Attacks on Stochastic Bandits: Stochastic multi-armed bandits form a class of online learning problems ...
- 10/01/2018 · Risk-Averse Stochastic Convex Bandit: Motivated by applications in clinical trials and finance, we study the p...
- 07/31/2015 · An Optimal Algorithm for Bandit and Zero-Order Convex Optimization with Two-Point Feedback: We consider the closely related problems of bandit convex optimization w...
- 08/12/2020 · Non-Stochastic Control with Bandit Feedback: We study the problem of controlling a linear dynamical system with adver...
- 09/22/2016 · (Bandit) Convex Optimization with Biased Noisy Gradient Oracles: Algorithms for bandit convex optimization and online learning often rely...
- 02/12/2022 · Adaptive Bandit Convex Optimization with Heterogeneous Curvature: We consider the problem of adversarial bandit convex optimization, that ...
- 05/28/2021 · Efficient Online-Bandit Strategies for Minimax Learning Problems: Several learning problems involve solving min-max problems, e.g., empiri...
