Regret Bounds for Safe Gaussian Process Bandit Optimization

05/05/2020
by Sanae Amani, et al.

Many applications require a learner to make sequential decisions under uncertainty about both the system's payoff function and its safety constraints. In safety-critical systems, it is paramount that the learner's actions do not violate the safety constraints at any stage of the learning process. In this paper, we study a stochastic bandit optimization problem in which the unknown payoff and constraint functions are sampled from Gaussian Processes (GPs), a setting first considered in [Srinivas et al., 2010]. We develop a safe variant of GP-UCB, called SGP-UCB, with the modifications necessary to respect the safety constraints at every round. The algorithm has two distinct phases: the first seeks to estimate the set of safe actions in the decision set, while the second follows the GP-UCB decision rule. Our main contribution is to derive the first sub-linear regret bounds for this problem. We also numerically compare SGP-UCB against existing safe Bayesian GP optimization algorithms.
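The two-phase structure described in the abstract can be illustrated with a rough sketch. This is not the authors' SGP-UCB: the abstract gives no algorithmic details, so everything below is an assumption made for illustration. That includes the finite 1-D decision set, the RBF kernel, the constraint convention g(x) >= h, the fixed phase length `T1`, the constant confidence width `beta`, and the names `safe_gp_ucb`, `gp_posterior`, and `seed_idx`. Phase 1 plays only actions from a seed set assumed to be safe; phase 2 applies a UCB rule restricted to actions whose lower confidence bound on the constraint clears the threshold.

```python
import numpy as np

def rbf(a, b, ls=0.2):
    """Squared-exponential kernel between 1-D point arrays a and b."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

def gp_posterior(X, y, Xs, noise=1e-2):
    """Posterior mean and std of a zero-mean, unit-variance GP at Xs."""
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, Xs)
    mu = Ks.T @ np.linalg.solve(K, y)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks), axis=0)
    return mu, np.sqrt(np.maximum(var, 0.0))

def safe_gp_ucb(f, g, grid, seed_idx, h=0.0, T1=10, T=40,
                beta=2.0, noise_sd=0.05, rng=None):
    """Hypothetical two-phase safe UCB loop (illustrative assumptions only).

    f, g     : payoff and constraint functions, observed with Gaussian noise
    grid     : finite decision set (1-D array)
    seed_idx : indices of actions assumed safe a priori, i.e. g(x) >= h
    """
    rng = np.random.default_rng(0) if rng is None else rng
    X, yf, yg, chosen = [], [], [], []
    for t in range(T):
        if t < T1:
            # Phase 1: explore only inside the known-safe seed set.
            i = int(rng.choice(seed_idx))
        else:
            Xa = np.array(X)
            mu_f, sd_f = gp_posterior(Xa, np.array(yf), grid)
            mu_g, sd_g = gp_posterior(Xa, np.array(yg), grid)
            # Estimated safe set: lower confidence bound on g clears h.
            safe = mu_g - beta * sd_g >= h
            safe[seed_idx] = True  # seed actions stay available
            # Phase 2: UCB rule on the payoff, restricted to the safe set.
            ucb = np.where(safe, mu_f + beta * sd_f, -np.inf)
            i = int(np.argmax(ucb))
        x = grid[i]
        X.append(x)
        yf.append(f(x) + noise_sd * rng.normal())
        yg.append(g(x) + noise_sd * rng.normal())
        chosen.append(i)
    return chosen

# Toy usage: payoff peaks at x = 0.4; actions with x <= 0.5 are truly safe.
grid = np.linspace(0.0, 1.0, 50)
picks = safe_gp_ucb(lambda x: -(x - 0.4) ** 2, lambda x: 0.5 - x,
                    grid, seed_idx=[0, 1, 2])
```

A constant `beta` and a fixed `T1` keep the sketch short; regret analyses of GP-UCB-style methods typically use a round-dependent confidence width instead.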

research
08/16/2019

Linear Stochastic Bandits Under Safety Constraints

Bandit algorithms have various applications in safety-critical systems, w...
research
11/06/2019

Safe Linear Thompson Sampling

The design and performance analysis of bandit algorithms in the presence...
research
11/03/2022

Benefits of Monotonicity in Safe Exploration with Gaussian Processes

We consider the problem of sequentially maximising an unknown function o...
research
11/10/2022

Adaptive Real Time Exploration and Optimization for Safety-Critical Systems

We consider the problem of decision-making under uncertainty in an envir...
research
06/13/2019

Robust Regression for Safe Exploration in Control

We study the problem of safe learning and exploration in sequential cont...
research
12/09/2022

Information-Theoretic Safe Exploration with Gaussian Processes

We consider a sequential decision making task where we are not allowed t...
research
06/08/2020

Learning under Invariable Bayesian Safety

A recent body of work addresses safety constraints in explore-and-exploi...
