PopArt: Efficient Sparse Regression and Experimental Design for Optimal Sparse Linear Bandits

10/25/2022
by Kyoungseok Jang, et al.

In sparse linear bandits, a learning agent sequentially selects an action and receives reward feedback, where the reward function depends linearly on only a few coordinates of the action's covariates. This setting arises in many real-world sequential decision-making problems. In this paper, we propose a simple and computationally efficient sparse linear estimation method called PopArt that enjoys a tighter ℓ_1 recovery guarantee than Lasso (Tibshirani, 1996) in many problems. Our bound naturally motivates an experimental design criterion that is convex and thus computationally efficient to solve. Based on our novel estimator and design criterion, we derive sparse linear bandit algorithms whose regret upper bounds improve on the state of the art (Hao et al., 2020), especially with respect to the geometry of the given action set. Finally, we prove a matching lower bound for sparse linear bandits in the data-poor regime, which closes the gap between the upper and lower bounds in prior work.
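To make the setting concrete, the following is a minimal sketch of the sparse linear bandit interaction described above, not of PopArt itself (the abstract does not give the estimator's details). All names (`make_sparse_theta`, `expected_reward`, `pull`), the dimensions, the ±1 action set, the Gaussian noise model, and the uniform-random policy are assumptions chosen purely for illustration.

```python
import random

def make_sparse_theta(d, s, rng):
    """Hypothetical helper: a d-dimensional parameter with exactly s nonzero coordinates."""
    theta = [0.0] * d
    for i in rng.sample(range(d), s):
        theta[i] = rng.choice([-1.0, 1.0])
    return theta

def expected_reward(action, theta):
    """Linear reward: only coordinates where theta is nonzero contribute."""
    return sum(a * t for a, t in zip(action, theta))

def pull(action, theta, rng, noise_std=0.1):
    """Noisy reward feedback (Gaussian noise is an assumption of this sketch)."""
    return expected_reward(action, theta) + rng.gauss(0.0, noise_std)

rng = random.Random(0)
d, s = 20, 3  # ambient dimension and sparsity level (illustrative values)
theta = make_sparse_theta(d, s, rng)

# A finite action set of random sign vectors (an assumption, not from the paper).
actions = [[rng.choice([-1.0, 1.0]) for _ in range(d)] for _ in range(50)]
best = max(expected_reward(a, theta) for a in actions)

# Baseline: a uniform-random policy, used only to show how regret accumulates.
regret = 0.0
for _ in range(100):
    a = rng.choice(actions)
    observed = pull(a, theta, rng)  # what a learner would see at this round
    regret += best - expected_reward(a, theta)

# Changing a coordinate outside the support leaves the expected reward unchanged.
off_support = next(i for i, t in enumerate(theta) if t == 0.0)
perturbed = list(actions[0])
perturbed[off_support] = -perturbed[off_support]
same_reward = expected_reward(perturbed, theta) == expected_reward(actions[0], theta)
```

A bandit algorithm would replace the uniform-random choice with a rule that uses the observed rewards to estimate the unknown sparse parameter; the regret it accumulates against `best` is the quantity the paper's bounds concern.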

