Blind Decision Making: Reinforcement Learning with Delayed Observations

11/16/2020
by Mridul Agarwal, et al.

Reinforcement learning typically assumes that the state update resulting from previous actions is available instantaneously and can therefore be used for making future decisions. However, this may not always be true. When the state update is not available, the decision is made partly in the blind, since it cannot rely on current state information. This paper proposes an approach in which decisions are made under a delay in the knowledge of the state, based on the available information, which may not include the current state. One alternative is to include the actions taken after the last-known state as part of the state information; however, this enlarges the state space, making the problem more complex and slowing convergence. The proposed algorithm instead does not enlarge the state space compared to the case with no delay in the state update. Evaluations on basic RL environments further illustrate the improved performance of the proposed algorithm.
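
To make the state-space concern concrete, the following is a minimal sketch of the augmented-state alternative mentioned in the abstract, not of the paper's proposed algorithm. It assumes a fixed delay of `delay` steps and finite state and action sets, neither of which the abstract specifies; the function names and the toy sizes are hypothetical. Under those assumptions, an augmented state pairs the last-known state with the actions taken since, so the number of augmented states grows as the number of states times the number of actions raised to the delay.

```python
from itertools import product


def augmented_states(states, actions, delay):
    """Enumerate (last_known_state, pending_actions) pairs.

    Assumption: the delay is a fixed number of steps, so exactly
    `delay` actions have been taken since the last-known state.
    """
    return [
        (s, pending)
        for s in states
        for pending in product(actions, repeat=delay)
    ]


def augmented_state_count(num_states, num_actions, delay):
    """Closed-form size of the augmented state space: |S| * |A|^delay."""
    return num_states * num_actions ** delay


if __name__ == "__main__":
    # Hypothetical toy problem: 4 states, 2 actions.
    states = range(4)
    actions = ("left", "right")
    for d in range(4):
        enumerated = len(augmented_states(states, actions, d))
        # The enumeration and the closed form agree for every delay.
        assert enumerated == augmented_state_count(len(states), len(actions), d)
        print(f"delay={d}: {enumerated} augmented states")
```

With no delay the augmented space coincides with the original one; each extra step of delay multiplies its size by the number of actions, which is the growth the abstract describes as slowing convergence.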


research
08/06/2020

A Gentle Lecture Note on Filtrations in Reinforcement Learning

This note aims to provide a basic intuition on the concept of filtration...
research
02/23/2020

Discriminative Particle Filter Reinforcement Learning for Complex Partial Observations

Deep reinforcement learning is successful in decision making for sophist...
research
02/08/2018

Learning and Querying Fast Generative Models for Reinforcement Learning

A key challenge in model-based reinforcement learning (RL) is to synthes...
research
05/18/2021

Learning and Information in Stochastic Networks and Queues

We review the role of information and learning in the stability and opti...
research
07/24/2020

Clinician-in-the-Loop Decision Making: Reinforcement Learning with Near-Optimal Set-Valued Policies

Standard reinforcement learning (RL) aims to find an optimal policy that...
research
07/01/2022

Offline Policy Optimization with Eligible Actions

Offline policy optimization could have a large impact on many real-world...
research
01/27/2020

Reinforcement Learning-based Autoscaling of Workflows in the Cloud: A Survey

Reinforcement Learning (RL) has demonstrated a great potential for autom...
