Real-Time Reinforcement Learning

11/11/2019
by Simon Ramstedt, et al.

Markov Decision Processes (MDPs), the mathematical framework underlying most algorithms in Reinforcement Learning (RL), are often used in a way that wrongly assumes that the state of an agent's environment does not change during action selection. As RL systems based on MDPs begin to find application in real-world, safety-critical situations, this mismatch between the assumptions underlying classical MDPs and the reality of real-time computation may lead to undesirable outcomes. In this paper, we introduce a new framework in which states and actions evolve simultaneously, and we show how it is related to the classical MDP formulation. We analyze existing algorithms under the new real-time formulation and show why they are suboptimal when used in real time. We then use those insights to create a new algorithm, Real-Time Actor-Critic (RTAC), that outperforms the existing state-of-the-art continuous control algorithm, Soft Actor-Critic, in both real-time and non-real-time settings. Code and videos can be found at https://github.com/rmst/rtrl.
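To make the idea of states and actions evolving simultaneously concrete, here is a minimal sketch of one way such an interaction loop could be written. It assumes that the action chosen during a time step only takes effect at the next step, and that the agent therefore observes the environment state together with the action that is still pending. The names `ToyEnv` and `RealTimeWrapper` are hypothetical and purely illustrative; this is not the code from the repository linked above.

```python
class ToyEnv:
    """Hypothetical 1-D environment: the state is an integer position."""

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):
        # The action shifts the position; reward penalizes distance from 0.
        self.pos += action
        reward = -abs(self.pos)
        return self.pos, reward


class RealTimeWrapper:
    """Illustrative wrapper (an assumption, not the authors' implementation).

    The action passed to step() is not applied immediately. Instead, the
    action chosen on the previous step is applied, and the newly chosen
    action becomes part of the next observation.
    """

    def __init__(self, env, initial_action=0):
        self.env = env
        self.initial_action = initial_action

    def reset(self):
        self.pending = self.initial_action
        return (self.env.reset(), self.pending)

    def step(self, new_action):
        # The environment advances under the previously selected action
        # while the agent was busy computing new_action.
        state, reward = self.env.step(self.pending)
        self.pending = new_action
        return (state, self.pending), reward


env = RealTimeWrapper(ToyEnv())
obs0 = env.reset()
obs1, r1 = env.step(1)    # applies the initial action 0; action 1 is now pending
obs2, r2 = env.step(-1)   # applies action 1; action -1 is now pending
```

In this sketch the observation is a pair of environment state and pending action, so a policy always knows which action will be executed while it computes the next one. A conventional turn-based loop would instead apply each action to the state it was computed from.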


