Better Safe than Sorry: Evidence Accumulation Allows for Safe Reinforcement Learning

09/24/2018
by Akshat Agarwal, et al.

In the real world, agents often have to operate with incomplete information, limited sensing capabilities, and inherently stochastic environments, making individual observations incomplete and unreliable. Moreover, in many situations it is preferable to delay a decision rather than run the risk of making a bad one. In such situations it is necessary to aggregate information before taking an action; however, most state-of-the-art reinforcement learning (RL) algorithms are biased towards taking an action at every time step, even if the agent is not particularly confident in its chosen action. This lack of caution can lead the agent to make critical mistakes, regardless of prior experience and acclimation to the environment. Motivated by theories of dynamic resolution of uncertainty during decision making in biological brains, we propose a simple accumulator module which accumulates evidence in favor of each possible decision, encodes uncertainty as a dynamic competition between actions, and acts on the environment only when it is sufficiently confident in the chosen action. The agent makes no decision by default, and the burden of proof falls on the policy to accrue evidence strongly in favor of a single decision. Our results show that this accumulator module achieves near-optimal performance on a simple guessing game, far outperforming deep recurrent networks using traditional, forced action-selection policies.
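
A minimal sketch of the accumulator idea described above, not the authors' implementation: the class name EvidenceAccumulator, the threshold and decay parameters, and the particular competition rule (leaky accumulation of softmax preferences with mean subtraction) are illustrative assumptions chosen to show how per-action evidence can build over time while "no decision" remains the default.

```python
import numpy as np


class EvidenceAccumulator:
    """Illustrative evidence accumulator (hypothetical names and parameters,
    not the paper's exact mechanism). Per-action evidence builds up over
    time steps; the agent acts only once one action's evidence clears a
    confidence threshold, otherwise it defers and keeps observing."""

    def __init__(self, n_actions, threshold=2.0, decay=0.9):
        self.threshold = threshold          # confidence needed to commit to an action
        self.decay = decay                  # leak, so stale evidence fades over time
        self.evidence = np.zeros(n_actions)

    def reset(self):
        self.evidence[:] = 0.0

    def step(self, action_logits):
        """Accumulate one observation's worth of evidence.

        action_logits: per-action preferences from the policy.
        Returns the chosen action index, or None to signal "no decision yet".
        """
        probs = np.exp(action_logits - np.max(action_logits))
        probs /= probs.sum()
        # Dynamic competition: each accumulator gains its own probability
        # mass and is pulled down by the average of the others.
        self.evidence = self.decay * self.evidence + (probs - probs.mean())
        best = int(np.argmax(self.evidence))
        if self.evidence[best] >= self.threshold:
            return best                     # confident: act on the environment
        return None                         # defer: gather more evidence


# Example usage with a noisy observation stream; the policy is a stand-in
# producing random logits with a weak persistent signal for action 2.
acc = EvidenceAccumulator(n_actions=4)
rng = np.random.default_rng(0)
for t in range(50):
    logits = rng.normal(size=4) + np.array([0.0, 0.0, 2.0, 0.0])
    choice = acc.step(logits)
    if choice is not None:
        print(f"committed to action {choice} at step {t}")
        break
else:
    print("no decision within 50 steps")
```

Note the design choice this sketch tries to mirror: deferral is the default outcome, and committing to an action requires the evidence for one option to win the competition and cross the threshold.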
