Reinforcement Learning in a Physics-Inspired Semi-Markov Environment

04/15/2020
by Colin Bellinger, et al.

Reinforcement learning (RL) has shown great potential in many applications of scientific discovery and design. Recent work includes, for example, the design of new molecular structures and compositions for therapeutic drugs. Much of the existing work applying RL to scientific domains, however, assumes that the available state representation obeys the Markov property. For reasons of time, cost, sensor accuracy, and gaps in scientific knowledge, many scientific design and discovery problems do not satisfy this property. Thus, a framework other than the Markov decision process (MDP) should be used to plan or find the optimal policy. In this paper, we present a physics-inspired semi-Markov RL environment, the phase change environment. In addition, we evaluate value-based RL algorithms designed for MDPs and for partially observable MDPs (POMDPs) on the proposed environment. Our results demonstrate that deep recurrent Q-networks (DRQNs) significantly outperform deep Q-networks (DQNs), and that DRQNs benefit from training with hindsight experience replay. Implications of semi-Markovian RL and POMDPs for scientific laboratories are also discussed.

research
03/17/2022

Semi-Markov Offline Reinforcement Learning for Healthcare

Reinforcement learning (RL) tasks are typically framed as Markov Decisio...
research
01/02/2023

On the Challenges of using Reinforcement Learning in Precision Drug Dosing: Delay and Prolongedness of Action Effects

Drug dosing is an important application of AI, which can be formulated a...
research
05/01/2022

Processing Network Controls via Deep Reinforcement Learning

Novel advanced policy gradient (APG) algorithms, such as proximal policy...
research
02/20/2021

Importance of Environment Design in Reinforcement Learning: A Study of a Robotic Environment

An in-depth understanding of the particular environment is crucial in re...
research
08/23/2021

A generalized stacked reinforcement learning method for sampled systems

A common setting of reinforcement learning (RL) is a Markov decision pro...
research
12/06/2021

MDPFuzzer: Finding Crash-Triggering State Sequences in Models Solving the Markov Decision Process

The Markov decision process (MDP) provides a mathematical framework for ...
research
05/11/2020

TOMA: Topological Map Abstraction for Reinforcement Learning

Animals are able to discover the topological map (graph) of surrounding ...
