A Geometric Perspective on Optimal Representations for Reinforcement Learning

01/31/2019
by Marc G. Bellemare, et al.

This paper proposes a new approach to representation learning based on geometric properties of the space of value functions. We study a two-part approximation of the value function: a nonlinear map from states to vectors, or representation, followed by a linear map from vectors to values. Our formulation considers adapting the representation to minimize the (linear) approximation error of the value functions of all stationary policies for a given environment. We show that this optimization reduces to making accurate predictions regarding a special class of value functions which we call adversarial value functions (AVFs). We argue that these AVFs make excellent auxiliary tasks, and use them to construct a loss which can be efficiently minimized to find a near-optimal representation for reinforcement learning. We highlight characteristics of the method in a series of experiments on the four-room domain.
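
As a rough illustration of the two-part approximation described above, the sketch below treats a representation as a feature matrix with one row per state and measures how well the best linear map on top of it can reproduce a given value function. It is not the paper's AVF loss or algorithm: the function names (`policy_value`, `linear_approximation_error`), the three-state chain, and the two candidate feature matrices are all hypothetical and chosen only to make the idea concrete.

```python
import numpy as np


def policy_value(P, r, gamma):
    """Value function of a fixed policy: solves V = r + gamma * P @ V.

    P: (n, n) state-to-state transition matrix under the policy.
    r: (n,) expected one-step reward in each state.
    """
    n = P.shape[0]
    return np.linalg.solve(np.eye(n) - gamma * P, r)


def linear_approximation_error(phi, v):
    """Squared error of the best linear fit to v using features phi.

    phi: (n, d) feature matrix (the representation, one row per state).
    v:   (n,) value function to approximate.
    """
    theta, *_ = np.linalg.lstsq(phi, v, rcond=None)  # best linear weights
    residual = phi @ theta - v
    return float(residual @ residual)


# Hypothetical 3-state chain: 0 -> 1 -> 2, state 2 is absorbing with reward 1.
P = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 1.0]])
r = np.array([0.0, 0.0, 1.0])
v = policy_value(P, r, gamma=0.9)

# Two hypothetical representations of the same three states.
phi_good = np.column_stack([v, np.ones(3)])  # span contains v
phi_poor = np.ones((3, 1))                   # constant feature only

err_good = linear_approximation_error(phi_good, v)
err_poor = linear_approximation_error(phi_poor, v)
```

Here `err_good` is numerically zero because `v` lies in the span of `phi_good`, while `err_poor` is strictly positive. The formulation in the abstract concerns this kind of error for the value functions of all stationary policies, not for a single one as in this toy example.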


