Boosting for Online Convex Optimization

02/18/2021
by Elad Hazan, et al.

We consider the decision-making framework of online convex optimization with a very large number of experts. This setting is ubiquitous in contextual and reinforcement learning problems, where the size of the policy class renders enumeration and search within the policy class infeasible. Instead, we consider generalizing the methodology of online boosting. We define a weak learning algorithm as a mechanism that guarantees multiplicatively approximate regret against a base class of experts. In this access model, we give an efficient boosting algorithm that guarantees near-optimal regret against the convex hull of the base class. We consider both full and partial (a.k.a. bandit) information feedback models. We also give an analogous efficient boosting algorithm for the i.i.d. statistical setting. Our results simultaneously generalize online boosting and gradient boosting guarantees to the contextual learning model, online convex optimization, and bandit linear optimization settings.
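
To make the phrase "regret against the convex hull of the base class" concrete, here is a minimal toy sketch. It is not taken from the paper: the two constant experts, the squared loss, and all function names are assumptions chosen only for illustration. It shows that, for a convex loss, the best mixture of experts can incur strictly less loss than the best single expert, so the convex hull is the stronger benchmark.

```python
# Toy illustration (hypothetical setup, not from the paper): two constant
# experts and a fixed convex loss. The best point in the convex hull of the
# base class beats every individual expert.

def loss(x: float) -> float:
    """Convex loss with its minimizer at 0 (an assumed example loss)."""
    return x * x


def cumulative_loss(prediction: float, rounds: int) -> float:
    """Total loss of playing the same prediction in every round."""
    return sum(loss(prediction) for _ in range(rounds))


def best_in_base_class(experts, rounds):
    """Benchmark 1: the best single expert in hindsight."""
    return min(cumulative_loss(e, rounds) for e in experts)


def best_in_convex_hull(experts, rounds, grid=101):
    """Benchmark 2: the best convex combination of the two experts,
    found by a grid search over the mixing weight w in [0, 1]."""
    e0, e1 = experts
    candidates = []
    for i in range(grid):
        w = i / (grid - 1)
        candidates.append(cumulative_loss(w * e0 + (1 - w) * e1, rounds))
    return min(candidates)


EXPERTS = (-1.0, 1.0)  # assumed base class: two constant predictors
ROUNDS = 10

base_benchmark = best_in_base_class(EXPERTS, ROUNDS)
hull_benchmark = best_in_convex_hull(EXPERTS, ROUNDS)
print(f"best single expert: {base_benchmark}, best mixture: {hull_benchmark}")
```

In this toy case each expert pays loss 1 per round, while the equal mixture predicts 0 and pays nothing. An algorithm with low regret against the convex hull is therefore held to a harder standard than one with low regret against the base class alone.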

research
07/23/2020

Online Boosting with Bandit Feedback

We consider the problem of online boosting for regression tasks, when on...
research
08/01/2022

Boosted Off-Policy Learning

We investigate boosted ensemble models for off-policy learning from logg...
research
04/14/2022

Gradient boosting for convex cone predict and optimize problems

Many problems in engineering and statistics involve both predictive fore...
research
03/02/2020

Online Agnostic Boosting via Regret Minimization

Boosting is a widely used machine learning approach based on the idea of...
research
05/19/2022

What killed the Convex Booster ?

A landmark negative result of Long and Servedio established a worst-case...
research
03/05/2012

Agnostic System Identification for Model-Based Reinforcement Learning

A fundamental problem in control is to learn a model of a system from ob...
research
07/12/2023

Online Inventory Problems: Beyond the i.i.d. Setting with Online Convex Optimization

We study multi-product inventory control problems where a manager makes ...
