Compress and Control

11/19/2014
by Joel Veness, et al.

This paper describes a new information-theoretic policy evaluation technique for reinforcement learning. This technique converts any compression or density model into a corresponding estimate of value. Under appropriate stationarity and ergodicity conditions, we show that the use of a sufficiently powerful model gives rise to a consistent value function estimator. We also study the behavior of this technique when applied to various Atari 2600 video games, where the use of suboptimal modeling techniques is unavoidable. We consider three fundamentally different models, all too limited to perfectly model the dynamics of the system. Remarkably, we find that our technique provides sufficiently accurate value estimates for effective on-policy control. We conclude with a suggestive study highlighting the potential of our technique to scale to large problems.
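The core idea — turning a density model into a value estimate — can be illustrated with a minimal sketch. This is not the paper's exact algorithm; it assumes a simple count-based density model standing in for a compressor, discretizes observed returns into bins, and recovers V(s) = E[G | s] via Bayes' rule over the bins. All function and variable names here are illustrative.

```python
from collections import Counter, defaultdict

def estimate_values(trajectories, gamma=0.99, n_bins=5):
    """Compression-based policy evaluation, sketched with count models.

    trajectories: list of episodes, each a list of (state, reward) pairs.
    Returns a function mapping a state to an estimated value.
    """
    # Compute the discounted return observed at every state visit.
    samples = []  # (state, return) pairs
    for traj in trajectories:
        g = 0.0
        for state, reward in reversed(traj):
            g = reward + gamma * g
            samples.append((state, g))

    # Discretize returns into equal-width bins.
    lo = min(g for _, g in samples)
    hi = max(g for _, g in samples)
    width = (hi - lo) / n_bins or 1.0

    def bin_of(g):
        return min(int((g - lo) / width), n_bins - 1)

    # Count-based stand-ins for the paper's density models/compressors:
    # a prior over return bins and a conditional state model per bin.
    prior = Counter(bin_of(g) for _, g in samples)
    cond = defaultdict(Counter)
    for s, g in samples:
        cond[bin_of(g)][s] += 1

    def value(state):
        # V(s) ~= sum_b center(b) * p(s|b) p(b) / sum_b p(s|b) p(b)
        num = den = 0.0
        for b, n_b in prior.items():
            # Add-one smoothing so unseen states get nonzero mass.
            p_s_given_b = (cond[b][state] + 1) / (sum(cond[b].values()) + 1)
            w = p_s_given_b * n_b
            center = lo + (b + 0.5) * width
            num += center * w
            den += w
        return num / den

    return value
```

The Counter-based conditional model could be swapped for any sequential density model (the paper studies several, including context-tree methods), which is the sense in which the technique converts a compression model into a value estimator.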


