Optimal Stochastic Nonconvex Optimization with Bandit Feedback

03/30/2021
by   Puning Zhao, et al.

In this paper, we analyze the continuous-armed bandit problem for nonconvex cost functions under certain smoothness and sublevel-set assumptions. We first derive an upper bound on the expected cumulative regret of a simple bin-splitting method. We then propose an adaptive bin-splitting method that significantly improves performance. Furthermore, we derive a minimax lower bound showing that the new adaptive method achieves locally minimax optimal expected cumulative regret.
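To make the bin-splitting idea concrete, the following is a minimal sketch of a non-adaptive variant: partition the arm space [0, 1] into uniform bins and run a UCB-style discrete bandit over the bin centers. This is only an illustration under assumed settings, not the authors' algorithm, and the names f, noise_std, T, and num_bins are hypothetical, not quantities from the paper.

```python
import numpy as np

def bin_splitting_ucb(f, noise_std, T, num_bins, rng):
    """Run a UCB-style bandit over a fixed uniform partition of [0, 1].

    A minimal sketch of the (non-adaptive) bin-splitting idea only; it is
    not the paper's adaptive method, and all parameters are illustrative.
    """
    centers = (np.arange(num_bins) + 0.5) / num_bins  # one representative point per bin
    counts = np.zeros(num_bins)
    means = np.zeros(num_bins)
    cumulative_regret = 0.0
    best_center_value = min(f(c) for c in centers)    # benchmark for bookkeeping

    for t in range(1, T + 1):
        if t <= num_bins:
            i = t - 1                                  # initialization: pull each bin once
        else:
            bonus = noise_std * np.sqrt(2.0 * np.log(t) / counts)
            i = int(np.argmin(means - bonus))          # lower confidence bound (we minimize cost)
        y = f(centers[i]) + noise_std * rng.standard_normal()
        counts[i] += 1
        means[i] += (y - means[i]) / counts[i]         # incremental mean of observed costs
        cumulative_regret += f(centers[i]) - best_center_value
    return cumulative_regret

# Example: a nonconvex cost with several local minima on [0, 1]
cost = lambda x: np.sin(8.0 * x) + 0.5 * x
print(bin_splitting_ucb(cost, noise_std=0.1, T=5000, num_bins=32,
                        rng=np.random.default_rng(0)))
```

The adaptive method described in the abstract refines this idea by splitting bins unevenly based on observed feedback rather than fixing the partition in advance, which is what yields the improved (locally minimax optimal) regret.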

