Online Nonsubmodular Minimization with Delayed Costs: From Full Information to Bandit Feedback

05/15/2022
by Tianyi Lin, et al.

Motivated by applications to online learning in sparse estimation and Bayesian optimization, we consider the problem of online unconstrained nonsubmodular minimization with delayed costs in both full information and bandit feedback settings. In contrast to previous works on online unconstrained submodular minimization, we focus on a class of nonsubmodular functions with special structure and prove regret guarantees for several variants of the online and approximate online bandit gradient descent algorithms in static and delayed scenarios. We derive bounds for the agent's regret in both the full information and bandit feedback settings, even when the delay between choosing a decision and receiving the incurred cost is unbounded. Key to our approach are the notion of (α, β)-regret and the extension of the generic convex relaxation model from <cit.>, the analysis of which is of independent interest. We conduct several simulation studies to demonstrate the efficacy of our algorithms.
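To make the delayed-cost setting concrete, here is a minimal sketch of projected online gradient descent on the box [0, 1]^n in which each round's gradient only arrives after a delay. It is an illustration under stated assumptions, not the algorithm analyzed in the paper: the costs are assumed linear (so each gradient is a fixed vector), the delays are assumed known to the simulator, and the names `delayed_ogd`, `cost_vectors`, `delays`, and `step_size` are hypothetical.

```python
from collections import defaultdict


def delayed_ogd(cost_vectors, delays, dim, step_size):
    """Projected online gradient descent on [0, 1]^dim with delayed feedback.

    Assumption: the round-t cost is linear, x -> <cost_vectors[t], x>, so its
    gradient is cost_vectors[t] regardless of the point played. That gradient
    only becomes available at the end of round t + delays[t].
    Returns the list of points played, one per round.
    """
    horizon = len(cost_vectors)
    x = [0.5] * dim                  # arbitrary starting point inside the box
    arrivals = defaultdict(list)     # round index -> gradients arriving then
    played = []
    for t in range(horizon):
        played.append(list(x))
        # Schedule this round's feedback for the round in which it arrives.
        arrivals[t + delays[t]].append(cost_vectors[t])
        # Apply every gradient whose feedback arrives in this round.
        for g in arrivals.pop(t, []):
            x = [xi - step_size * gi for xi, gi in zip(x, g)]
            # Euclidean projection onto the box is coordinate-wise clipping.
            x = [min(1.0, max(0.0, xi)) for xi in x]
    return played


if __name__ == "__main__":
    costs = [[1.0, -1.0]] * 4
    print(delayed_ogd(costs, delays=[0, 0, 0, 0], dim=2, step_size=0.25))
    print(delayed_ogd(costs, delays=[2, 2, 2, 2], dim=2, step_size=0.25))
```

With zero delays the iterate moves after every round; with a delay of two rounds it stays at the starting point until the first feedback arrives. Gradients that would arrive after the final round are never applied, which is one simple way to see why regret bounds in this setting depend on the delays.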


