Efficient and Safe Exploration in Deterministic Markov Decision Processes with Unknown Transition Models

04/01/2019
by   Erdem Bıyık, et al.
0

We propose a safe exploration algorithm for deterministic Markov Decision Processes with unknown transition models. Our algorithm guarantees safety by leveraging Lipschitz-continuity to ensure that no unsafe states are visited during exploration. Unlike many other existing techniques, the provided safety guarantee is deterministic. Our algorithm is optimized to reduce the number of actions needed for exploring the safe space. We demonstrate the performance of our algorithm in comparison with baseline methods in simulation on navigation tasks.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
09/12/2018

Safe Exploration in Markov Decision Processes with Time-Variant Safety using Spatio-Temporal Gaussian Process

In many real-world applications (e.g., planetary exploration, robot navi...
research
06/15/2016

Safe Exploration in Finite Markov Decision Processes with Gaussian Processes

In classical reinforcement learning, when exploring an environment, agen...
research
10/30/2019

Safe Exploration for Interactive Machine Learning

In Interactive Machine Learning (IML), we iteratively make decisions and...
research
06/25/2019

SOS: Safe, Optimal and Small Strategies for Hybrid Markov Decision Processes

For hybrid Markov decision processes, UPPAAL Stratego can compute strate...
research
10/01/2020

Learning to be safe, in finite time

This paper aims to put forward the concept that learning to take safe ac...
research
06/28/2019

L*-Based Learning of Markov Decision Processes (Extended Version)

Automata learning techniques automatically generate system models from t...
research
07/07/2020

Provably Safe PAC-MDP Exploration Using Analogies

A key challenge in applying reinforcement learning to safety-critical do...

Please sign up or login with your details

Forgot password? Click here to reset