Learning to Act Greedily: Polymatroid Semi-Bandits

05/30/2014
by   Branislav Kveton, et al.
Many important optimization problems, such as the minimum spanning tree and minimum-cost flow, can be solved optimally by a greedy method. In this work, we study a learning variant of these problems, where the model of the problem is unknown and has to be learned by interacting repeatedly with the environment in the bandit setting. We formalize our learning problem quite generally, as learning how to maximize an unknown modular function on a known polymatroid. We propose a computationally efficient algorithm for solving our problem and bound its expected cumulative regret. Our gap-dependent upper bound is tight up to a constant factor and our gap-free upper bound is tight up to polylogarithmic factors. Finally, we evaluate our method on three problems and demonstrate that it is practical.
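
To make the offline problem behind the abstract concrete, here is a minimal Python sketch of the classical greedy method for maximizing a modular (linear) function over a polymatroid when the weights are known. It is an illustration only, not the learning algorithm proposed in the paper. The names `greedy_polymatroid_max` and `uniform_rank`, the example weights, and the choice of a uniform-matroid rank function as the submodular function are all assumptions made for this example.

```python
from typing import Callable, FrozenSet, List, Sequence


def greedy_polymatroid_max(
    weights: Sequence[float],
    f: Callable[[FrozenSet[int]], float],
) -> List[float]:
    """Greedy maximization of sum_e weights[e] * x[e] over the polymatroid
    {x >= 0 : sum of x over S <= f(S) for every subset S}.

    Assumes f is monotone, submodular, and f(empty set) == 0 (hypothetical
    oracle supplied by the caller).
    """
    # Visit items in order of decreasing weight.
    order = sorted(range(len(weights)), key=lambda e: weights[e], reverse=True)
    x = [0.0] * len(weights)
    prefix: FrozenSet[int] = frozenset()
    prev = f(prefix)
    for e in order:
        if weights[e] <= 0:
            # Items with non-positive weight cannot improve the objective.
            break
        prefix = prefix | {e}
        cur = f(prefix)
        # Give item e the marginal gain of adding it to the current prefix.
        x[e] = cur - prev
        prev = cur
    return x


def uniform_rank(k: int) -> Callable[[FrozenSet[int]], float]:
    """Rank function of a uniform matroid: at most k items are independent."""
    return lambda s: float(min(len(s), k))


# Illustrative weights (assumed, not from the paper).
weights = [0.2, 0.9, 0.5, 0.7]
x = greedy_polymatroid_max(weights, uniform_rank(2))
value = sum(w * xi for w, xi in zip(weights, x))
```

With the uniform-matroid rank function, the greedy solution simply selects the two heaviest items. In the learning variant described in the abstract, the weights are unknown and would have to be estimated from repeated interaction, which this sketch does not attempt.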

research
03/20/2014

Matroid Bandits: Fast Combinatorial Optimization with Learning

A matroid is a notion of independence in combinatorial optimization whic...
research
10/03/2014

Tight Regret Bounds for Stochastic Combinatorial Semi-Bandits

A stochastic combinatorial semi-bandit is an online learning problem whe...
research
05/27/2019

Scalable K-Medoids via True Error Bound and Familywise Bandits

K-Medoids(KM) is a standard clustering method, used extensively on semi-...
research
02/20/2023

Achieving Hierarchy-Free Approximation for Bilevel Programs With Equilibrium Constraints

In this paper, we develop an approximation scheme for solving bilevel pr...
research
02/09/2021

Nonstochastic Bandits with Infinitely Many Experts

We study the problem of nonstochastic bandits with infinitely many exper...
research
02/19/2023

Estimating Optimal Policy Value in General Linear Contextual Bandits

In many bandit problems, the maximal reward achievable by a policy is of...
research
12/02/2021

Convergence Guarantees for Deep Epsilon Greedy Policy Learning

Policy learning is a quickly growing area. As robotics and computers con...