Unsupervised Curricula for Visual Meta-Reinforcement Learning

12/09/2019
by Allan Jabri, et al.

In principle, meta-reinforcement learning algorithms leverage experience across many tasks to learn fast reinforcement learning (RL) strategies that transfer to similar tasks. However, current meta-RL approaches rely on manually-defined distributions of training tasks, and hand-crafting these task distributions can be challenging and time-consuming. Can "useful" pre-training tasks be discovered in an unsupervised manner? We develop an unsupervised algorithm for inducing an adaptive meta-training task distribution, i.e. an automatic curriculum, by modeling unsupervised interaction in a visual environment. The task distribution is scaffolded by a parametric density model of the meta-learner's trajectory distribution. We formulate unsupervised meta-RL as information maximization between a latent task variable and the meta-learner's data distribution, and describe a practical instantiation which alternates between integration of recent experience into the task distribution and meta-learning of the updated tasks. Repeating this procedure leads to iterative reorganization such that the curriculum adapts as the meta-learner's data distribution shifts. In particular, we show how discriminative clustering for visual representation can support trajectory-level task acquisition and exploration in domains with pixel observations, avoiding pitfalls of alternatives. In experiments on vision-based navigation and manipulation domains, we show that the algorithm allows for unsupervised meta-learning that transfers to downstream tasks specified by hand-crafted reward functions and serves as pre-training for more efficient supervised meta-learning of test task distributions.
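
The alternation described in the abstract (fold recent experience into the task distribution, then meta-learn the updated tasks) can be illustrated with a toy sketch. This is not the authors' implementation. It makes several assumptions that do not come from the abstract: plain k-means stands in for the parametric density model and the discriminative clustering, states are 2-D points rather than pixel observations, the per-task reward is taken to be a pointwise mutual-information estimate log q(s|z) - log q(s) under a uniform prior over tasks, and the meta-learning step is replaced by a placeholder that samples states near each task's center. All function names (`fit_task_model`, `log_q_given_task`, `task_reward`, `collect_experience`) are hypothetical.

```python
import math
import random


def fit_task_model(states, k, iters=20, seed=0):
    """Cluster visited states with plain k-means; each center stands for one latent task z."""
    rng = random.Random(seed)
    centers = rng.sample(states, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for s in states:
            nearest = min(range(k), key=lambda j: math.dist(s, centers[j]))
            groups[nearest].append(s)
        # Move each center to the mean of its group; keep it in place if the group is empty.
        centers = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else centers[j]
            for j, g in enumerate(groups)
        ]
    return centers


def log_q_given_task(s, center, sigma=1.0):
    """Log-density of an isotropic Gaussian around a task center (assumed model form)."""
    dim = len(s)
    sq_dist = math.dist(s, center) ** 2
    return -0.5 * sq_dist / sigma**2 - 0.5 * dim * math.log(2 * math.pi * sigma**2)


def task_reward(s, z, centers, sigma=1.0):
    """Assumed reward: log q(s|z) - log q(s), with a uniform prior over tasks."""
    logs = [log_q_given_task(s, c, sigma) for c in centers]
    top = max(logs)
    # log q(s) = logsumexp_z' log q(s|z') - log K, computed stably.
    log_q_s = top + math.log(sum(math.exp(v - top) for v in logs)) - math.log(len(centers))
    return logs[z] - log_q_s


def collect_experience(centers, n_per_task, rng, noise=0.3):
    """Placeholder for meta-learning: a real meta-learner would be trained on
    task_reward; here states are simply sampled near each task's center."""
    return [
        tuple(x + rng.gauss(0.0, noise) for x in c)
        for c in centers
        for _ in range(n_per_task)
    ]


# Initial "unsupervised interaction": two blobs of visited states.
rng = random.Random(0)
states = [(rng.gauss(m, 0.3), rng.gauss(m, 0.3)) for m in (0.0, 5.0) for _ in range(50)]
centers = fit_task_model(states, k=2)

# Alternate: gather experience on the current tasks, then refit the task model.
for _ in range(3):
    states = collect_experience(centers, 50, rng)
    centers = fit_task_model(states, k=2)
```

Under these assumptions the reward for task z is bounded above by log K (here log 2), is highest for states that the model attributes to z, and is negative for states that clearly belong to another task, which is the sense in which each cluster defines a distinct task.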
