Masked World Models for Visual Control

06/28/2022
by   Younggyo Seo, et al.

Visual model-based reinforcement learning (RL) has the potential to enable sample-efficient robot learning from visual observations. Yet current approaches typically train a single model end-to-end to learn both visual representations and dynamics, making it difficult to accurately model the interaction between robots and small objects. In this work, we introduce a visual model-based RL framework that decouples visual representation learning and dynamics learning. Specifically, we train an autoencoder with convolutional layers and vision transformers (ViT) to reconstruct pixels given masked convolutional features, and learn a latent dynamics model that operates on the representations from the autoencoder. Moreover, to encode task-relevant information, we introduce an auxiliary reward prediction objective for the autoencoder. We continually update both the autoencoder and the dynamics model using online samples collected from environment interaction. We demonstrate that our decoupling approach achieves state-of-the-art performance on a variety of visual robotic tasks from Meta-world and RLBench. For example, we achieve an 81.7% success rate on 50 visual robotic manipulation tasks from Meta-world, while the baseline achieves 67.9%. Project website: https://sites.google.com/view/mwm-rl.
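To make the phrase "masked convolutional features" concrete, the following is a minimal sketch of random token masking over a flattened feature map. It is an illustration under stated assumptions, not the authors' implementation: the function name `mask_feature_tokens`, the 0.75 mask ratio, and the toy 16-token input are all hypothetical choices, and the convolutional stem, the ViT, and the reward head are omitted entirely.

```python
import random


def mask_feature_tokens(tokens, mask_ratio=0.75, seed=0):
    """Randomly hide a fraction of feature tokens (hypothetical helper).

    tokens: list of per-location feature vectors, e.g. a conv feature map
            flattened to a sequence. The 0.75 default ratio is an assumption.
    Returns (visible_tokens, visible_indices, masked_indices).
    """
    if not 0.0 <= mask_ratio < 1.0:
        raise ValueError("mask_ratio must be in [0, 1)")
    n = len(tokens)
    # Keep at least one token so an encoder always has some input.
    n_keep = max(1, int(round(n * (1.0 - mask_ratio))))
    rng = random.Random(seed)
    order = list(range(n))
    rng.shuffle(order)
    # Sort so the visible tokens keep their original spatial order.
    visible_indices = sorted(order[:n_keep])
    masked_indices = sorted(order[n_keep:])
    visible_tokens = [tokens[i] for i in visible_indices]
    return visible_tokens, visible_indices, masked_indices


# Toy stand-in for a 4x4 conv feature map with 2 channels per location.
tokens = [[float(i), float(i) + 0.5] for i in range(16)]
visible, vis_idx, mask_idx = mask_feature_tokens(tokens, mask_ratio=0.75, seed=0)
print(len(visible), len(mask_idx))  # prints "4 12"
```

In a setup like the one the abstract describes, only the visible tokens would be passed to the encoder, and a decoder would be trained to reconstruct the pixels (and, with the auxiliary objective, predict reward) from them.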

research
04/18/2022

INFOrmation Prioritization through EmPOWERment in Visual Model-Based RL

Model-based reinforcement learning (RL) algorithms designed for handling...
research
09/10/2020

Keypoints into the Future: Self-Supervised Correspondence in Model-Based Reinforcement Learning

Predictive models have been at the core of many robotic systems, from qu...
research
08/31/2023

RePo: Resilient Model-Based Reinforcement Learning by Regularizing Posterior Predictability

Visual model-based RL methods typically encode image observations into l...
research
03/07/2023

End-to-End Deep Visual Control for Mastering Needle-Picking Skills With World Models and Behavior Cloning

Needle picking is a challenging surgical task in robot-assisted surgery ...
research
09/01/2022

Transformers are Sample Efficient World Models

Deep reinforcement learning agents are notoriously sample inefficient, w...
research
08/28/2018

SOLAR: Deep Structured Latent Representations for Model-Based Reinforcement Learning

Model-based reinforcement learning (RL) methods can be broadly categoriz...
research
06/14/2023

VIBR: Learning View-Invariant Value Functions for Robust Visual Control

End-to-end reinforcement learning on images showed significant progress ...
