Low-Dimensional State and Action Representation Learning with MDP Homomorphism Metrics

07/04/2021
by Nicolò Botteghi, et al.

Deep Reinforcement Learning has shown its ability to solve complicated problems directly from high-dimensional observations. However, in end-to-end settings, Reinforcement Learning algorithms are not sample-efficient and require long training times and large quantities of data. In this work, we propose a framework for sample-efficient Reinforcement Learning that takes advantage of state and action representations to transform a high-dimensional problem into a low-dimensional one. Moreover, we seek the optimal policy mapping latent states to latent actions. Because the policy is now learned on abstract representations, we use auxiliary loss functions to enforce the lifting of this policy to the original problem domain. Results show that the novel framework can efficiently learn low-dimensional and interpretable state and action representations and the optimal latent policy.
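To make the idea of lifting a latent policy concrete, the following is a minimal tabular sketch and not the method from the paper, which learns its state and action representations with neural encoders and auxiliary losses. It assumes a hand-built deterministic toy MDP and hand-written maps; every name in it (`step`, `latent_step`, `phi`, `alpha`, `is_homomorphism`, `lift_policy`) is hypothetical. It checks the standard MDP homomorphism conditions for the deterministic case (the state map commutes with the transitions and rewards are preserved) and then lifts a latent policy back to the original action set.

```python
# Toy illustration only: a hand-built deterministic MDP and hand-written
# abstraction maps. None of these names come from the paper.

STATES = range(4)                  # four states arranged on a ring
ACTIONS = ("cw", "ccw", "stay")    # original (larger) action set


def step(s, a):
    """Original MDP: move around the ring; reward 1 for landing on an odd state."""
    if a == "cw":
        s_next = (s + 1) % 4
    elif a == "ccw":
        s_next = (s - 1) % 4
    else:
        s_next = s
    return s_next, float(s_next % 2 == 1)


def latent_step(z, u):
    """Latent MDP: two states (parity) and two latent actions ('move', 'stay')."""
    z_next = 1 - z if u == "move" else z
    return z_next, float(z_next == 1)


def phi(s):
    """State map: original state -> latent state (its parity)."""
    return s % 2


def alpha(a):
    """Action map: 'cw' and 'ccw' collapse to the single latent action 'move'."""
    return "stay" if a == "stay" else "move"


def is_homomorphism():
    """Check phi(step(s, a)) == latent_step(phi(s), alpha(a)) and reward equality."""
    for s in STATES:
        for a in ACTIONS:
            s_next, r = step(s, a)
            z_next, r_latent = latent_step(phi(s), alpha(a))
            if phi(s_next) != z_next or r != r_latent:
                return False
    return True


def lift_policy(latent_policy):
    """Lift a latent policy: pick any original action that maps to the latent choice."""
    def policy(s):
        u = latent_policy(phi(s))
        return next(a for a in ACTIONS if alpha(a) == u)
    return policy


def latent_policy(z):
    """A hand-written latent policy: move to the rewarding latent state, then stay."""
    return "move" if z == 0 else "stay"


lifted = lift_policy(latent_policy)
```

Because `is_homomorphism()` holds for this toy example, any original action in the preimage of the chosen latent action earns the same reward and leads to the same latent state, so the lifted policy behaves in the original MDP as the latent policy does in the abstract one. In the learned setting described above, exact equalities like these would presumably be replaced by loss terms that are only approximately satisfied.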


