Reinforcement Learning with Information-Theoretic Actuation

09/30/2021
by Elliot Catt, et al.

Reinforcement learning formalizes an embodied agent's interaction with the environment through observations, rewards and actions. But where do the actions come from? Actions are often considered to represent something external, such as the movement of a limb, a chess piece, or more generally, the output of an actuator. In this work we explore and formalize a contrasting view, namely that actions are best thought of as the output of a sequence of internal choices with respect to an action model. This view is particularly well suited to leveraging recent advances in large sequence models as prior knowledge for multi-task reinforcement learning problems. Our main contribution is to show how to augment the standard MDP formalism with a sequential notion of internal action using information-theoretic techniques, and that this leads to self-consistent definitions of both internal and external action value functions.
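
To make the idea of "a sequence of internal choices with respect to an action model" more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the construction from the paper: the four-action model, the function names, and the use of a Huffman prefix code as the information-theoretic device are all hypothetical choices made for this example. Each internal choice is one bit, and a sequence of bits decodes to one external action.

```python
import heapq
from itertools import count
from math import log2

def build_prefix_tree(action_model):
    """Build a Huffman tree over external actions.

    action_model maps each external action (a string here, so that
    tuples can be reserved for internal tree nodes) to its probability.
    """
    tiebreak = count()  # avoids comparing tree nodes when probabilities tie
    heap = [(p, next(tiebreak), a) for a, p in action_model.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, left = heapq.heappop(heap)
        p1, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (p0 + p1, next(tiebreak), (left, right)))
    return heap[0][2]

def decode(tree, internal_choices):
    """Follow binary internal choices down the tree to an external action.

    Returns the external action and the number of internal choices used.
    """
    node, used = tree, 0
    for bit in internal_choices:
        if not isinstance(node, tuple):
            break  # already reached a leaf, i.e. an external action
        node = node[bit]
        used += 1
    if isinstance(node, tuple):
        raise ValueError("internal choices ended before reaching an action")
    return node, used

def codewords(tree, prefix=()):
    """Map every external action to its sequence of internal choices."""
    if not isinstance(tree, tuple):
        return {tree: list(prefix)}
    table = {}
    for bit, child in enumerate(tree):
        table.update(codewords(child, prefix + (bit,)))
    return table

# Hypothetical action model (an assumption made for this illustration).
action_model = {"left": 0.5, "right": 0.25, "up": 0.125, "down": 0.125}
tree = build_prefix_tree(action_model)
table = codewords(tree)
```

With these dyadic probabilities, each action's codeword length equals `-log2` of its model probability, so actions the model considers likely need fewer internal choices. A learned sequence model could stand in for `action_model`, but that substitution is outside what this sketch shows.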


research
05/26/2023

Reinforcement Learning with Simple Sequence Priors

Everything else being equal, simpler models should be preferred over mor...
research
02/09/2023

An Information-Theoretic Analysis of Nonstationary Bandit Learning

In nonstationary bandit learning problems, the decision-maker must conti...
research
11/21/2019

Information-Theoretic Confidence Bounds for Reinforcement Learning

We integrate information-theoretic concepts into the design and analysis...
research
02/04/2019

The Natural Language of Actions

We introduce Act2Vec, a general framework for learning context-based act...
research
07/18/2020

Modulation of viability signals for self-regulatory control

We revisit the role of instrumental value as a driver of adaptive behavi...
research
12/16/2018

An Active Information Seeking Model for Goal-oriented Vision-and-Language Tasks

As Computer Vision algorithms move from passive analysis of pixels to ac...
research
04/20/2017

Reinforcement Learning with External Knowledge and Two-Stage Q-functions for Predicting Popular Reddit Threads

This paper addresses the problem of predicting popularity of comments in...
