
SEAL: Self-supervised Embodied Active Learning using Exploration and 3D Consistency

by Devendra Singh Chaplot, et al.

In this paper, we explore how to build upon the data and models of Internet images and adapt them to robot vision without requiring any extra labels. We present a framework called Self-supervised Embodied Active Learning (SEAL). It uses perception models trained on Internet images to learn an active exploration policy. The observations gathered by this exploration policy are labelled using 3D consistency and used to improve the perception model. We build and use 3D semantic maps to learn both action and perception in a completely self-supervised manner. The semantic map serves two purposes: computing an intrinsic motivation reward for training the exploration policy, and labelling the agent's observations using spatio-temporal 3D consistency and label propagation. We demonstrate that the SEAL framework can be used to close the action-perception loop: it improves the object detection and instance segmentation performance of a pretrained perception model just by moving around in training environments, and the improved perception model can in turn be used to improve Object Goal Navigation.
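The two roles of the semantic map described above can be illustrated with a toy sketch. It is not the authors' implementation: the voxel-keyed dictionary, the max-score aggregation, the 0.9 confidence threshold, and every function name below are assumptions made for illustration, since the abstract does not specify them. Each observation is assumed to be already projected into 3D, as a list of (voxel, predicted label, score) triples.

```python
from collections import defaultdict


def build_semantic_map(observations):
    """Aggregate per-frame predictions into a voxel map.

    Assumption: each voxel keeps the highest score seen for each label.
    """
    voxel_scores = defaultdict(dict)
    for frame in observations:
        for voxel, label, score in frame:
            previous = voxel_scores[voxel].get(label, 0.0)
            voxel_scores[voxel][label] = max(previous, score)
    return voxel_scores


def confident_labels(voxel_scores, threshold=0.9):
    """Return the best label per voxel, keeping only confident voxels."""
    labels = {}
    for voxel, scores in voxel_scores.items():
        label, score = max(scores.items(), key=lambda item: item[1])
        if score >= threshold:
            labels[voxel] = label
    return labels


def intrinsic_reward(voxel_scores, threshold=0.9):
    """Hypothetical exploration reward: number of confidently labelled voxels."""
    return len(confident_labels(voxel_scores, threshold))


def propagate_labels(observations, voxel_scores, threshold=0.9):
    """Relabel every frame from the map; unconfident voxels get None."""
    labels = confident_labels(voxel_scores, threshold)
    return [
        [(voxel, labels.get(voxel)) for voxel, _, _ in frame]
        for frame in observations
    ]


# Two views of the same two voxels; the second view is unsure about voxel (0, 0, 0).
frames = [
    [((0, 0, 0), "chair", 0.95), ((1, 0, 0), "table", 0.40)],
    [((0, 0, 0), "sofa", 0.30), ((1, 0, 0), "table", 0.92)],
]
semantic_map = build_semantic_map(frames)
reward = intrinsic_reward(semantic_map)
pseudo_labels = propagate_labels(frames, semantic_map)
```

In this toy run, the weak "sofa" prediction in the second view is overwritten by the confident "chair" label stored in the map, which is the kind of cross-view consistency the abstract relies on to produce training labels without human annotation.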



