Utilizing Skipped Frames in Action Repeats via Pseudo-Actions

05/07/2021
by Taisei Hashimoto, et al.

In many deep reinforcement learning settings, when an agent takes an action, it repeats the same action a predefined number of times without observing the states until the next action-decision point. This technique of action repetition has several merits in training the agent, but the data between action-decision points (i.e., intermediate frames) are, in effect, discarded. Since the amount of training data is inversely proportional to the interval of action repeats, a longer interval can reduce the sample efficiency of training. In this paper, we propose a simple but effective approach to alleviate this problem by introducing the concept of pseudo-actions. The key idea of our method is to make the transitions between action-decision points usable as training data by assigning them pseudo-actions. For continuous control tasks, pseudo-actions are obtained as the average of the action sequence straddling an action-decision point. For discrete control tasks, pseudo-actions are computed from learned action embeddings. This method can be combined with any model-free reinforcement learning algorithm that involves the learning of Q-functions. We demonstrate the effectiveness of our approach on both continuous and discrete control tasks in OpenAI Gym.
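The continuous-control case can be illustrated with a minimal sketch. It assumes that a pseudo-action is the plain mean of the per-frame actions inside a window whose length equals the action-repeat interval. The abstract only says "the average of the action sequence straddling an action-decision point", so the exact definition in the full paper may differ. The function name `windowed_pseudo_actions` and the toy action values are hypothetical and chosen for illustration only.

```python
import numpy as np


def windowed_pseudo_actions(per_frame_actions, repeat):
    """Average every window of `repeat` consecutive per-frame actions.

    Assumption (not taken from the paper): a pseudo-action is the
    unweighted mean of the actions executed inside the window.
    Windows that start exactly at an action-decision point reproduce
    the real action; the other windows straddle a decision point and
    yield pseudo-actions.
    """
    actions = np.asarray(per_frame_actions, dtype=float)
    if actions.ndim == 1:
        # Treat a flat sequence as one-dimensional actions.
        actions = actions[:, None]
    n_frames = actions.shape[0]
    if repeat < 1 or n_frames < repeat:
        raise ValueError("need at least `repeat` frames and repeat >= 1")
    return np.stack(
        [actions[start:start + repeat].mean(axis=0)
         for start in range(n_frames - repeat + 1)]
    )


# Toy example: two decisions (+1.0, then -1.0), each repeated 4 times.
repeat = 4
frames = np.repeat(np.array([[1.0], [-1.0]]), repeat, axis=0)
result = windowed_pseudo_actions(frames, repeat)
print(result.ravel())
```

In this toy case the windows at offsets 0 and 4 start at action-decision points and recover the real actions, while offsets 1 to 3 straddle the decision point and give intermediate values. Under the stated assumption, each such window could be stored as an extra transition (state at the window start, pseudo-action, summed reward, state at the window end) for Q-function learning.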


research
11/24/2017

Action Branching Architectures for Deep Reinforcement Learning

Discrete-action algorithms have been central to numerous recent successe...
research
01/18/2021

Deep Reinforcement Learning with Embedded LQR Controllers

Reinforcement learning is a model-free optimal control method that optim...
research
09/05/2015

Reinforcement Learning with Parameterized Actions

We introduce a model-free algorithm for learning in Markov decision proc...
research
02/22/2021

Action Redundancy in Reinforcement Learning

Maximum Entropy (MaxEnt) reinforcement learning is a powerful learning p...
research
06/13/2023

Dynamic Interval Restrictions on Action Spaces in Deep Reinforcement Learning for Obstacle Avoidance

Deep reinforcement learning algorithms typically act on the same set of ...
research
11/26/2022

How Crucial is Transformer in Decision Transformer?

Decision Transformer (DT) is a recently proposed architecture for Reinfo...
research
04/24/2023

Moving Forward by Moving Backward: Embedding Action Impact over Action Semantics

A common assumption when training embodied agents is that the impact of ...
