Learning under Invariable Bayesian Safety

06/08/2020
by Gal Bahar, et al.

A recent body of work addresses safety constraints in explore-and-exploit systems. Such constraints arise, for example, when exploration is carried out by individuals whose welfare must be balanced against overall welfare. In this paper, we adopt a model inspired by recent work on a bandit-like setting for recommendations. We contribute to this line of literature by introducing a safety constraint that must be respected in every round: the expected value in each round must be above a given threshold. Under this model, a safe explore-and-exploit policy requires careful planning; otherwise, it leads to sub-optimal welfare. We devise an asymptotically optimal algorithm for the setting and analyze its instance-dependent convergence rate.
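To make the per-round constraint concrete, here is a minimal sketch of what "expected value above a given threshold" could mean for a randomized choice between arms. It is an illustration under stated assumptions, not the algorithm from the paper: the function names (`expected_value`, `is_safe`, `max_explore_prob`), the two-arm setup, and the use of fixed mean estimates are all hypothetical and do not appear in the source.

```python
from typing import Sequence


def expected_value(probs: Sequence[float], means: Sequence[float]) -> float:
    """Expected reward of a randomized choice over arms.

    `means` are the current (e.g. prior or posterior) mean estimates per arm.
    This is an illustrative assumption, not the paper's exact model.
    """
    if len(probs) != len(means):
        raise ValueError("probs and means must have the same length")
    if any(p < 0 for p in probs) or abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probs must be a probability distribution")
    return sum(p * m for p, m in zip(probs, means))


def is_safe(probs: Sequence[float], means: Sequence[float], threshold: float) -> bool:
    """Check the per-round constraint: expected value must not fall below the threshold."""
    return expected_value(probs, means) >= threshold


def max_explore_prob(mean_safe: float, mean_risky: float, threshold: float) -> float:
    """Largest probability of playing a risky arm when mixing it with a safe arm.

    Solves p * mean_risky + (1 - p) * mean_safe >= threshold for p.
    """
    if mean_risky >= threshold:
        return 1.0  # the risky arm already satisfies the constraint on its own
    if mean_safe < threshold:
        raise ValueError("no mixture of these two arms meets the threshold")
    return (mean_safe - threshold) / (mean_safe - mean_risky)


if __name__ == "__main__":
    p = max_explore_prob(mean_safe=1.0, mean_risky=-1.0, threshold=0.0)
    print(p, is_safe([1.0 - p, p], [1.0, -1.0], 0.0))
```

The sketch shows why exploration has to be planned: an arm whose expected value is below the threshold can only be tried with limited probability, and only when it is mixed with an arm that leaves enough slack above the threshold.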
