Mo' States Mo' Problems: Emergency Stop Mechanisms from Observation

12/03/2019
by Samuel Ainsworth, et al.

In many environments, only a relatively small subset of the complete state space is necessary in order to accomplish a given task. We develop a simple technique using emergency stops (e-stops) to exploit this phenomenon. Using e-stops significantly improves sample complexity by reducing the amount of required exploration, while retaining a performance bound that efficiently trades off the rate of convergence with a small asymptotic sub-optimality gap. We analyze the regret behavior of e-stops and present empirical results in discrete and continuous settings demonstrating that our reset mechanism can provide order-of-magnitude speedups on top of existing reinforcement learning methods.
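To make the idea concrete, here is a minimal sketch of how an e-stop could be layered on top of an existing environment. All names here (`EStopWrapper`, `ChainEnv`, `supported_states`) are illustrative, not the paper's implementation; the paper derives the supported region from observations, whereas this toy hard-codes it for a discrete chain MDP.

```python
# Hypothetical sketch of an e-stop reset mechanism: terminate the episode
# as soon as the agent leaves a "supported" subset of the state space,
# trading a small sub-optimality gap for much less wasted exploration.

class EStopWrapper:
    """Wraps an environment and triggers an emergency stop (early episode
    termination) whenever the state falls outside the supported set."""

    def __init__(self, env, supported_states):
        self.env = env
        self.supported = set(supported_states)

    def reset(self):
        return self.env.reset()

    def step(self, action):
        state, reward, done, info = self.env.step(action)
        if state not in self.supported:
            done = True                      # e-stop: end the episode instead of exploring further
            info = dict(info, e_stop=True)   # flag the reset as an e-stop, not a natural terminal
        return state, reward, done, info


class ChainEnv:
    """Toy 1-D chain MDP: actions move the agent left (-1) or right (+1);
    reaching the last state yields reward 1 and ends the episode."""

    def __init__(self, n=10):
        self.n, self.state = n, 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        self.state = max(0, min(self.n - 1, self.state + action))
        reward = 1.0 if self.state == self.n - 1 else 0.0
        return self.state, reward, self.state == self.n - 1, {}


# Suppose demonstrations only visited states 0..4; any state >= 5 e-stops.
env = EStopWrapper(ChainEnv(n=10), supported_states=range(5))
env.reset()
for _ in range(9):
    state, reward, done, info = env.step(+1)
    if done:
        break
print(state, info)  # the episode e-stops once the agent steps outside the support
```

In a learning loop, each e-stop simply triggers a reset, so the agent never spends samples in the unsupported region; the cost is that any optimal trajectory passing outside the support becomes unreachable, which is the source of the asymptotic sub-optimality gap the abstract mentions.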

