Safe Exploration for Interactive Machine Learning

by Matteo Turchetta, et al.

In Interactive Machine Learning (IML), we iteratively make decisions and obtain noisy observations of an unknown function. While IML methods such as Bayesian optimization and active learning have been successful in applications, on real-world systems they must provably avoid unsafe decisions. To this end, safe IML algorithms must carefully learn about a priori unknown constraints without making unsafe decisions. Existing algorithms for this problem learn about the safety of all decisions to ensure convergence. This is sample-inefficient, as it explores decisions that are not relevant for the original IML objective. In this paper, we introduce a novel framework that renders any existing unsafe IML algorithm safe. Our method works as an add-on that takes suggested decisions as input and exploits regularity assumptions in terms of a Gaussian process prior in order to efficiently learn about their safety. As a result, we only explore the safe set when necessary for the IML problem. We apply our framework to safe Bayesian optimization and to safe exploration in deterministic Markov Decision Processes (MDPs), which have been analyzed separately before. Our method outperforms other algorithms empirically.
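
The add-on idea in the abstract can be illustrated with a rough sketch. This is not the paper's algorithm: it only shows how a Gaussian process confidence bound could certify a suggested decision as safe before it is evaluated. All names (`rbf`, `gp_posterior`, `safe_step`), the kernel, the parameters `beta`, `threshold`, `noise`, and the fallback rule (evaluate the most uncertain certified-safe candidate) are assumptions made for illustration.

```python
import numpy as np

def rbf(a, b, lengthscale=1.0):
    """Squared-exponential kernel between two 1-D arrays of points."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_posterior(x_obs, y_obs, x_query, noise=1e-2, lengthscale=1.0):
    """Posterior mean and standard deviation of a zero-mean, unit-variance GP."""
    K = rbf(x_obs, x_obs, lengthscale) + noise**2 * np.eye(len(x_obs))
    K_s = rbf(x_obs, x_query, lengthscale)
    mean = K_s.T @ np.linalg.solve(K, y_obs)
    var = 1.0 - np.sum(K_s * np.linalg.solve(K, K_s), axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def safe_step(suggestion, x_obs, y_obs, candidates, threshold=0.0, beta=2.0):
    """Pick the decision to evaluate next (hypothetical wrapper).

    A candidate counts as certified safe when the GP lower confidence bound
    on the constraint value is at least `threshold`. If the candidate nearest
    to the suggestion is certified, return it; otherwise return the most
    uncertain certified-safe candidate (a placeholder fallback, assumed here).
    """
    mean, std = gp_posterior(x_obs, y_obs, candidates)
    safe = mean - beta * std >= threshold
    idx = int(np.argmin(np.abs(candidates - suggestion)))
    if safe[idx]:
        return float(candidates[idx])
    if not safe.any():
        raise RuntimeError("no candidate is certified safe")
    safe_idx = np.flatnonzero(safe)
    return float(candidates[safe_idx[np.argmax(std[safe_idx])]])

# One safe seed observation of the constraint at x = 0 (made-up value).
x_obs = np.array([0.0])
y_obs = np.array([1.0])
candidates = np.linspace(-2.0, 2.0, 41)

near = safe_step(0.1, x_obs, y_obs, candidates)  # suggestion close to the seed
far = safe_step(1.5, x_obs, y_obs, candidates)   # suggestion not yet certified
```

In this toy setup a suggestion near the seed passes through unchanged, while a distant suggestion is replaced by a decision inside the certified-safe region. A real safe-exploration method would choose the replacement so that the safe set grows toward the suggestion; the fallback here is only a stand-in.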




Safe Exploration in Finite Markov Decision Processes with Gaussian Processes

In classical reinforcement learning, when exploring an environment, agen...

Provably Safe PAC-MDP Exploration Using Analogies

A key challenge in applying reinforcement learning to safety-critical do...

Stagewise Safe Bayesian Optimization with Gaussian Processes

Enforcing safety is a key aspect of many problems pertaining to sequenti...

Verifying Controllers Against Adversarial Examples with Bayesian Optimization

Recent successes in reinforcement learning have led to the development ...

Safe Learning and Optimization Techniques: Towards a Survey of the State of the Art

Safe learning and optimization deals with learning and optimization prob...

Safe Exploration in Markov Decision Processes with Time-Variant Safety using Spatio-Temporal Gaussian Process

In many real-world applications (e.g., planetary exploration, robot navi...

Unscented Bayesian Optimization for Safe Robot Grasping

We address the robot grasp optimization problem of unknown objects consi...