Action Redundancy in Reinforcement Learning

02/22/2021
by Nir Baram, et al.

Maximum Entropy (MaxEnt) reinforcement learning is a powerful learning paradigm that seeks to maximize return under entropy regularization. However, action entropy does not necessarily coincide with state entropy, e.g., when multiple actions produce the same transition. Instead, we propose to maximize the transition entropy, i.e., the entropy of next states. We show that transition entropy can be described by two terms, namely, model-dependent transition entropy and action redundancy. In particular, we explore the latter in both deterministic and stochastic settings and develop tractable approximation methods in a near model-free setup. We construct algorithms to minimize action redundancy and demonstrate their effectiveness on a synthetic environment with multiple redundant actions as well as on contemporary benchmarks in Atari and MuJoCo. Our results suggest that action redundancy is a fundamental problem in reinforcement learning.
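The gap between action entropy and next-state entropy can be seen in a toy case. The sketch below is only an illustration, not the paper's algorithm: it assumes a single state with four hypothetical actions (`a1` to `a4`), a deterministic transition map in which three of them lead to the same next state, and two hand-picked policies. It does not reproduce the paper's formal definition of action redundancy.

```python
import math
from collections import defaultdict


def entropy(probs):
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def next_state_distribution(policy, transition):
    """Push a policy over actions through a deterministic action -> next-state map."""
    dist = defaultdict(float)
    for action, prob in policy.items():
        dist[transition[action]] += prob
    return dict(dist)


# Hypothetical single-state environment (an assumption for illustration):
# actions a1, a2, a3 are redundant because they all lead to s_A.
TRANSITION = {"a1": "s_A", "a2": "s_A", "a3": "s_A", "a4": "s_B"}

# Uniform over actions: this maximizes action entropy.
uniform = {a: 0.25 for a in TRANSITION}
# Half the mass on the redundant group, half on a4: uniform over next states.
reweighted = {"a1": 1 / 6, "a2": 1 / 6, "a3": 1 / 6, "a4": 1 / 2}

for name, policy in [("uniform", uniform), ("reweighted", reweighted)]:
    h_action = entropy(policy.values())
    h_next = entropy(next_state_distribution(policy, TRANSITION).values())
    print(f"{name}: action entropy={h_action:.3f}, next-state entropy={h_next:.3f}")
```

In this toy setup the uniform policy attains the largest action entropy (ln 4) but visits `s_A` three times as often as `s_B`, while the reweighted policy has lower action entropy yet reaches the largest next-state entropy (ln 2). With deterministic transitions, the difference between the two quantities equals the conditional entropy of the action given the next state.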

