Active Learning with Safety Constraints

06/22/2022
by Romain Camilleri, et al.

Active learning methods have shown great promise in reducing the number of samples necessary for learning. As automated learning systems are adopted into real-time, real-world decision-making pipelines, it is increasingly important that such algorithms are designed with safety in mind. In this work we investigate the complexity of learning the best safe decision in interactive environments. We reduce this problem to a constrained linear bandits problem, where our goal is to find the best arm satisfying certain (unknown) safety constraints. We propose an adaptive experimental design-based algorithm, which we show efficiently trades off the difficulty of showing that an arm is unsafe against the difficulty of showing that it is suboptimal. To our knowledge, our results are the first on best-arm identification in linear bandits with safety constraints. In practice, we demonstrate that this approach performs well on synthetic and real-world datasets.
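
To make the target of the constrained problem concrete, here is a minimal sketch of what "the best arm satisfying safety constraints" means when the parameters are known. It is not the paper's algorithm, which must learn the unknown parameters from noisy samples. The names `theta` (reward parameter), `mu` (safety parameter), `gamma` (safety threshold), and the function `best_safe_arm` are all assumptions introduced for illustration, as is the use of a single linear constraint.

```python
import numpy as np

def best_safe_arm(arms, theta, mu, gamma):
    """Return the index of the highest-reward arm whose safety value is <= gamma.

    Hypothetical oracle: assumes theta and mu are known, which is exactly
    what a learner in the bandit setting does NOT have access to.
    Returns None if no arm is safe.
    """
    rewards = arms @ theta          # linear reward of each arm
    safety = arms @ mu              # linear safety value of each arm
    safe = safety <= gamma          # boolean mask of safe arms
    if not safe.any():
        return None
    # Exclude unsafe arms from the argmax by giving them reward -inf.
    masked = np.where(safe, rewards, -np.inf)
    return int(np.argmax(masked))

# Illustrative (made-up) instance with three arms in two dimensions.
arms = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
theta = np.array([1.0, 0.5])
mu = np.array([1.0, 0.0])
gamma = 0.8

best = best_safe_arm(arms, theta, mu, gamma)
print(best)
```

In this toy instance, arm 0 is ruled out because it is unsafe, while arm 1 is safe but ruled out because it is suboptimal. These are the two kinds of elimination the abstract refers to.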


research
11/23/2021

Best Arm Identification with Safety Constraints

The best arm identification problem in the multi-armed bandit setting is...
research
09/15/2023

Price of Safety in Linear Best Arm Identification

We introduce the safe best-arm identification framework with linear feed...
research
06/10/2022

Interactively Learning Preference Constraints in Linear Bandits

We study sequential decision-making with known rewards and unknown const...
research
06/25/2021

Active Learning in Robotics: A Review of Control Principles

Active learning is a decision-making process. In both abstract and physi...
research
07/27/2023

A/B Testing and Best-arm Identification for Linear Bandits with Robustness to Non-stationarity

We investigate the fixed-budget best-arm identification (BAI) problem fo...
research
12/15/2020

Generalized Chernoff Sampling for Active Learning and Structured Bandit Algorithms

Active learning and structured stochastic bandit problems are intimately...
research
10/16/2017

Fully adaptive algorithm for pure exploration in linear bandits

We propose the first fully-adaptive algorithm for pure exploration in li...
