Reinforcement Learning with Simple Sequence Priors

05/26/2023
by Tankred Saanum, et al.

All else being equal, simpler models should be preferred over more complex ones. In reinforcement learning (RL), simplicity is typically quantified on an action-by-action basis, but this timescale ignores temporal regularities, such as repetitions, that are often present in sequential strategies. We therefore propose an RL algorithm that learns to solve tasks with sequences of actions that are compressible. We explore two possible sources of simple action sequences: sequences that can be learned by autoregressive models, and sequences that are compressible with off-the-shelf data compression algorithms. Distilling these preferences into sequence priors, we derive a novel information-theoretic objective that incentivizes agents to learn policies that maximize rewards while conforming to these priors. We show that the resulting RL algorithm learns faster and attains higher returns than state-of-the-art model-free approaches in a series of continuous control tasks from the DeepMind Control Suite. These priors also produce a powerful information-regularized agent that is robust to noisy observations and can perform open-loop control.
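To make the notion of a compressible action sequence concrete, the following is a minimal sketch, not the paper's objective or implementation. It assumes scalar actions in [-1, 1], quantizes them into a hypothetical number of bins, and uses Python's standard zlib module as a stand-in for an off-the-shelf compressor. The function name compressed_length and the parameters n_bins, low, and high are illustrative assumptions.

```python
import random
import zlib


def compressed_length(actions, n_bins=16, low=-1.0, high=1.0):
    """Size in bytes of the zlib-compressed, quantized action sequence.

    Assumption for illustration: actions are scalars in [low, high] and
    n_bins <= 256, so each quantized action fits in one byte.
    """
    span = high - low
    symbols = bytes(
        min(n_bins - 1, max(0, int((a - low) / span * n_bins)))
        for a in actions
    )
    return len(zlib.compress(symbols, 9))


rng = random.Random(0)

# A strategy that repeats a short motif, e.g. an alternating gait.
repetitive = [0.5, -0.5] * 100

# A strategy with no temporal regularity: independent uniform actions.
irregular = [rng.uniform(-1.0, 1.0) for _ in range(200)]

repetitive_cost = compressed_length(repetitive)
irregular_cost = compressed_length(irregular)

print("repetitive:", repetitive_cost, "bytes")
print("irregular: ", irregular_cost, "bytes")
```

Both sequences have the same length, but the repetitive one compresses to far fewer bytes. A per-action measure of simplicity would not see this difference, which is the kind of temporal regularity a sequence-level prior is meant to reward.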

research
03/23/2019

TTR-Based Rewards for Reinforcement Learning with Implicit Model Priors

Model-free reinforcement learning (RL) provides an attractive approach f...
research
09/30/2021

Reinforcement Learning with Information-Theoretic Actuation

Reinforcement Learning formalises an embodied agent's interaction with t...
research
10/20/2021

More Efficient Exploration with Symbolic Priors on Action Sequence Equivalences

Incorporating prior knowledge in reinforcement learning algorithms is ma...
research
02/07/2021

An Analysis of Frame-skipping in Reinforcement Learning

In the practice of sequential decision making, agents are often designed...
research
07/18/2020

Modulation of viability signals for self-regulatory control

We revisit the role of instrumental value as a driver of adaptive behavi...
research
06/08/2021

There Is No Turning Back: A Self-Supervised Approach for Reversibility-Aware Reinforcement Learning

We propose to learn to distinguish reversible from irreversible actions ...
research
12/02/2021

Residual Pathway Priors for Soft Equivariance Constraints

There is often a trade-off between building deep learning systems that a...
