DeepAI
Log In Sign Up

Efficient Reinforcement Learning in Block MDPs: A Model-free Representation Learning Approach

01/31/2022
by   Xuezhou Zhang, et al.
10

We present BRIEE (Block-structured Representation learning with Interleaved Explore Exploit), an algorithm for efficient reinforcement learning in Markov Decision Processes with block-structured dynamics (i.e., Block MDPs), where rich observations are generated from a set of unknown latent states. BRIEE interleaves latent states discovery, exploration, and exploitation together, and can provably learn a near-optimal policy with sample complexity scaling polynomially in the number of latent states, actions, and the time horizon, with no dependence on the size of the potentially infinite observation space. Empirically, we show that BRIEE is more sample efficient than the state-of-art Block MDP algorithm HOMER and other empirical RL baselines on challenging rich-observation combination lock problems that require deep exploration.

READ FULL TEXT
11/13/2019

Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning

We present an algorithm, HOMER, for exploration and reinforcement learni...
11/11/2016

Reinforcement Learning in Rich-Observation MDPs using Spectral Methods

Designing effective exploration-exploitation algorithms in Markov decisi...
03/15/2020

Provably Efficient Exploration for RL with Unsupervised Learning

We study how to use unsupervised learning for efficient exploration in r...
10/17/2021

Provable RL with Exogenous Distractors via Multistep Inverse Dynamics

Many real-world applications of reinforcement learning (RL) require the ...
08/17/2022

Nearly Optimal Latent State Decoding in Block MDPs

We investigate the problems of model estimation and reward-free learning...
01/25/2019

Provably efficient RL with Rich Observations via Latent State Decoding

We study the exploration problem in episodic MDPs with rich observations...
06/22/2021

Agnostic Reinforcement Learning with Low-Rank MDPs and Rich Observations

There have been many recent advances on provably efficient Reinforcement...