
A framework for reinforcement learning with autocorrelated actions

by Marcin Szulc, et al.

This paper addresses reinforcement learning with policies that produce actions based on states and on random elements that are autocorrelated across subsequent time instants. Consequently, the agent learns from experiments that extend over time and thus potentially provide better clues for policy improvement. Such policies are also easier to implement physically, e.g. in robotics, because they avoid making the robot shake; in contrast, most RL algorithms add white noise to the control signal, which causes unwanted shaking. An algorithm is introduced that approximately optimizes the aforementioned policy. Its efficiency is verified on four simulated learning control problems (Ant, HalfCheetah, Hopper, and Walker2D) against three other methods (PPO, SAC, ACER); the proposed algorithm outperforms the others on three of these problems.
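To illustrate the contrast the abstract draws, the sketch below generates temporally autocorrelated exploration noise with an Ornstein-Uhlenbeck-style mean-reverting process, a common way to obtain smooth action perturbations in continuous control. This is a minimal illustration, not the paper's actual noise model; the class name and parameter values (`theta`, `sigma`, `dt`) are assumptions chosen for demonstration.

```python
import numpy as np

class AutocorrelatedNoise:
    """Ornstein-Uhlenbeck-style noise: successive samples are correlated,
    so the perturbed control signal changes smoothly over time, unlike
    i.i.d. white noise. Illustrative sketch only; the paper's policy
    parameterization may differ."""

    def __init__(self, dim, theta=0.15, sigma=0.2, dt=1e-2, seed=0):
        self.theta = theta      # mean-reversion rate
        self.sigma = sigma      # noise scale
        self.dt = dt            # time step
        self.x = np.zeros(dim)  # current noise state
        self.rng = np.random.default_rng(seed)

    def sample(self):
        # Mean-reverting update: drift toward zero plus a scaled
        # Gaussian increment; the state carries over between calls,
        # which is what makes consecutive samples autocorrelated.
        self.x += (-self.theta * self.x * self.dt
                   + self.sigma * np.sqrt(self.dt)
                   * self.rng.normal(size=self.x.shape))
        return self.x.copy()

# Generate a trajectory of smooth exploration noise for a 2-D action space.
noise = AutocorrelatedNoise(dim=2)
samples = np.array([noise.sample() for _ in range(1000)])
```

Because each sample drifts only slightly from the previous one, consecutive noise values are strongly correlated, whereas white-noise exploration would yield near-zero lag-1 correlation.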

