Provably Safe PAC-MDP Exploration Using Analogies

07/07/2020
by Melrose Roderick, et al.

A key challenge in applying reinforcement learning to safety-critical domains is understanding how to balance exploration (needed to attain good performance on the task) with safety (needed to avoid catastrophic failure). Although a growing line of work in reinforcement learning has investigated this area of "safe exploration," most existing techniques 1) do not guarantee safety during the actual exploration process, and/or 2) limit the problem to a priori known and/or deterministic transition dynamics with strong smoothness assumptions. Addressing this gap, we propose Analogous Safe-state Exploration (ASE), an algorithm for provably safe exploration in MDPs with unknown, stochastic dynamics. Our method exploits analogies between state-action pairs to safely learn a near-optimal policy in a PAC-MDP sense. ASE also guides exploration towards the most task-relevant states, which empirically yields significant improvements in sample efficiency over existing methods.

research
09/28/2022

Guiding Safe Exploration with Weakest Preconditions

In reinforcement learning for safety-critical settings, it is often desi...
research
07/29/2022

Sample-efficient Safe Learning for Online Nonlinear Control with Control Barrier Functions

Reinforcement Learning (RL) and continuous nonlinear control have been s...
research
10/30/2019

Safe Exploration for Interactive Machine Learning

In Interactive Machine Learning (IML), we iteratively make decisions and...
research
04/01/2019

Efficient and Safe Exploration in Deterministic Markov Decision Processes with Unknown Transition Models

We propose a safe exploration algorithm for deterministic Markov Decisio...
research
07/16/2018

Shielded Decision-Making in MDPs

A prominent problem in artificial intelligence and machine learning is t...
research
10/26/2022

Provable Safe Reinforcement Learning with Binary Feedback

Safety is a crucial necessity in many applications of reinforcement lear...
research
10/25/2021

Safely Bridging Offline and Online Reinforcement Learning

A key challenge to deploying reinforcement learning in practice is explo...