Model-free Representation Learning and Exploration in Low-rank MDPs

02/14/2021
by Aditya Modi, et al.

The low-rank MDP has emerged as an important model for studying representation learning and exploration in reinforcement learning. With a known representation, several model-free exploration strategies exist. In contrast, all algorithms for the unknown representation setting are model-based, thereby requiring the ability to model the full dynamics. In this work, we present the first model-free representation learning algorithms for low-rank MDPs. The key algorithmic contribution is a new minimax representation learning objective, for which we provide variants with differing tradeoffs in their statistical and computational properties. We interleave this representation learning step with an exploration strategy to cover the state space in a reward-free manner. The resulting algorithms are provably sample efficient and can accommodate general function approximation to scale to complex environments.
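To make the setting concrete, the sketch below illustrates the low-rank structure T(s' | s, a) = ⟨φ(s, a), μ(s')⟩ and a generic min-max style representation-selection objective of the kind described above: an adversary picks the discriminator function whose conditional expectation is hardest to predict linearly in the candidate features, and the learner selects the features minimizing this worst-case error over data gathered by reward-free exploration. This is a minimal illustration under stated assumptions, not the authors' exact objective or algorithm; the helper names (minimax_rep_loss, select_representation) and the ridge-regression inner fit are hypothetical.

```python
import numpy as np

# Illustrative sketch (not the paper's exact objective).  In a low-rank MDP the
# transition kernel factorizes as T(s' | s, a) = <phi(s, a), mu(s')> for unknown
# d-dimensional feature maps phi and mu.  A model-free representation learner
# searches a candidate class of phi's whose induced linear predictions match
# observed next-state statistics, without modeling mu (i.e., the full dynamics).

def minimax_rep_loss(phi, discriminators, transitions, ridge=1e-3):
    """Hypothetical min-max loss: the adversary picks the discriminator f whose
    conditional expectation E[f(s') | s, a] is hardest to fit linearly in
    phi(s, a); the learner is scored by that worst-case regression error.
    `transitions` is a list of (s, a, s') tuples from reward-free exploration."""
    feats = np.array([phi(s, a) for (s, a, _) in transitions])        # n x d
    d = feats.shape[1]
    worst = 0.0
    for f in discriminators:                                          # adversary's choice
        targets = np.array([f(sp) for (_, _, sp) in transitions])     # n
        # Best linear fit of f(s') from features phi(s, a) via ridge regression.
        w = np.linalg.solve(feats.T @ feats + ridge * np.eye(d), feats.T @ targets)
        residual = float(np.mean((feats @ w - targets) ** 2))
        worst = max(worst, residual)
    return worst

def select_representation(candidate_phis, discriminators, transitions):
    # Learner's outer minimization over the candidate feature class.
    return min(candidate_phis,
               key=lambda phi: minimax_rep_loss(phi, discriminators, transitions))
```

In a full algorithm this selection step would alternate with collecting new exploratory data under the current features, so that coverage of the state space and quality of the representation improve together.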
