Pareto Front Identification with Regret Minimization

05/31/2023
by Wonyoung Kim, et al.

We consider Pareto front identification for linear bandits (PFILin), where the goal is to identify the set of arms whose reward vectors are not dominated by those of any other arm, when the mean reward vector is a linear function of the context. PFILin includes the best arm identification problem and multi-objective active learning as special cases. The sample complexity of our proposed algorithm is Õ(d/Δ^2), where d is the dimension of the contexts and Δ is a measure of problem complexity. This sample complexity is optimal up to a logarithmic factor. A novel feature of our algorithm is that it uses the contexts of all actions. In addition to efficiently identifying the Pareto front, our algorithm also guarantees an Õ(√(d/t)) bound on the instantaneous Pareto regret when the number of samples is larger than Ω(d log dL) for L-dimensional vector rewards. By using the contexts of all arms, our proposed algorithm simultaneously provides efficient Pareto front identification and regret minimization. Numerical experiments demonstrate that the proposed algorithm successfully identifies the Pareto front while minimizing the regret.
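
To make the target of the identification problem concrete, the following is a minimal sketch of what the Pareto front looks like when the parameter is already known. It is not the paper's algorithm: in the bandit setting the parameter is unknown and has to be estimated from samples. The names `mean_rewards`, `dominates`, `pareto_front`, and the toy `contexts` and `theta` values are illustrative assumptions, and the sketch assumes the standard dominance definition (at least as good in every objective, strictly better in at least one), which may differ in detail from the one used in the paper.

```python
# Illustrative only: computes the Pareto front for a KNOWN linear model.
# Contexts are K x d, theta is d x L, so each arm has an L-dimensional mean reward.

def mean_rewards(contexts, theta):
    """Return the K x L matrix of mean rewards, one row per arm."""
    d = len(theta)
    num_objectives = len(theta[0])
    return [
        [sum(x[k] * theta[k][l] for k in range(d)) for l in range(num_objectives)]
        for x in contexts
    ]

def dominates(a, b):
    """True if a is >= b in every objective and > b in at least one (assumed definition)."""
    return all(ai >= bi for ai, bi in zip(a, b)) and any(ai > bi for ai, bi in zip(a, b))

def pareto_front(rewards):
    """Return the indices of arms not dominated by any other arm."""
    return [
        i
        for i, r in enumerate(rewards)
        if not any(dominates(other, r) for j, other in enumerate(rewards) if j != i)
    ]

# Hypothetical toy instance: 4 arms, d = 2 context features, L = 2 objectives.
contexts = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.2, 0.2]]
theta = [[1.0, 0.0], [0.0, 1.0]]  # identity map, so rewards equal contexts

rewards = mean_rewards(contexts, theta)
front = pareto_front(rewards)
print(front)
```

In this toy instance the last arm is dominated by the third, while the first three arms trade off the two objectives against each other and so all belong to the front.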

research
06/06/2022

Robust Pareto Set Identification with Contaminated Bandit Feedback

We consider the Pareto set identification (PSI) problem in multi-objecti...
research
07/01/2023

Adaptive Algorithms for Relaxed Pareto Set Identification

In this paper we revisit the fixed-confidence identification of the Pare...
research
10/23/2021

Vector Optimization with Stochastic Bandit Feedback

We introduce vector optimization problems with stochastic bandit feedbac...
research
06/24/2020

Pareto Active Learning with Gaussian Processes and Adaptive Discretization

We consider the problem of optimizing a vector-valued objective function...
research
05/30/2019

Multi-Objective Generalized Linear Bandits

In this paper, we study the multi-objective bandits (MOB) problem, where...
research
10/16/2021

On the Pareto Frontier of Regret Minimization and Best Arm Identification in Stochastic Bandits

We study the Pareto frontier of two archetypal objectives in stochastic ...
research
03/05/2018

Costs and Rewards in Priced Timed Automata

We consider Pareto analysis of reachable states of multi-priced timed au...
