Vector Optimization with Stochastic Bandit Feedback

10/23/2021
by Çağın Ararat, et al.

We introduce vector optimization problems with stochastic bandit feedback, which extend the best arm identification problem to vector-valued rewards. We consider K designs with multi-dimensional mean reward vectors, which are partially ordered according to a polyhedral ordering cone C. This generalizes the concept of the Pareto set in multi-objective optimization and allows different sets of decision-maker preferences to be encoded by C. Unlike prior work, we define approximations of the Pareto set based on direction-free covering and gap notions. We study the setting where an evaluation of each design yields a noisy observation of its mean reward vector. Under a subgaussian noise assumption, we investigate the sample complexity of the naïve elimination algorithm in an (ϵ,δ)-PAC setting, where the goal is to identify an (ϵ,δ)-PAC Pareto set with the minimum number of design evaluations. In particular, we identify cone-dependent geometric conditions on the deviations of empirical reward vectors from their means under which the Pareto front can be approximated accurately. We run experiments to verify our theoretical results and to illustrate how C and the sampling budget affect the Pareto set, the returned (ϵ,δ)-PAC Pareto set, and the success of identification.
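To make the cone-based ordering concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the polyhedral cone is given in the form C = {x : Wx ≥ 0} for a matrix W, and it treats design j as dominating design i when the difference of their mean vectors is a nonzero element of C. The abstract does not spell out the naïve elimination procedure, so the sketch takes it to mean sampling every design equally often and returning the Pareto set of the empirical means; the ϵ-approximation is omitted. The names `dominates`, `pareto_set`, and `naive_elimination`, the Gaussian noise model, and all numeric values are illustrative assumptions.

```python
import numpy as np


def dominates(mu_j, mu_i, W, tol=1e-12):
    """True if mu_j - mu_i is a nonzero element of C = {x : W x >= 0} (assumed form)."""
    diff = np.asarray(mu_j, dtype=float) - np.asarray(mu_i, dtype=float)
    return bool(np.all(W @ diff >= -tol) and np.linalg.norm(diff) > tol)


def pareto_set(means, W):
    """Indices of designs that no other design dominates with respect to C."""
    means = np.asarray(means, dtype=float)
    K = len(means)
    return [
        i for i in range(K)
        if not any(dominates(means[j], means[i], W) for j in range(K) if j != i)
    ]


def naive_elimination(sample, K, n_per_design, W):
    """Assumed reading of naive elimination: sample uniformly, then return
    the Pareto set of the empirical mean vectors."""
    empirical = np.array([
        np.mean([sample(i) for _ in range(n_per_design)], axis=0)
        for i in range(K)
    ])
    return pareto_set(empirical, W)


# Hypothetical mean reward vectors for K = 4 designs in 2 dimensions.
true_means = np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.4], [0.2, 0.2]])

# Positive orthant: recovers the usual componentwise Pareto order.
W_orthant = np.eye(2)
# A wider cone (contains the orthant), so more pairs become comparable.
W_wide = np.array([[1.0, 0.5], [0.5, 1.0]])

pareto_orthant = pareto_set(true_means, W_orthant)
pareto_wide = pareto_set(true_means, W_wide)

# Noisy evaluations: Gaussian noise is one example of subgaussian noise.
rng = np.random.default_rng(0)


def sample(i):
    return true_means[i] + rng.normal(scale=0.01, size=2)


estimated = naive_elimination(sample, K=4, n_per_design=200, W=W_wide)
print(pareto_orthant, pareto_wide, estimated)
```

In this toy instance, widening the cone shrinks the Pareto set, because the third design dominates the first one under `W_wide` but not under the componentwise order. This is one way the choice of C can encode different decision-maker preferences.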


