Object Goal Navigation with End-to-End Self-Supervision

12/09/2022
by So Yeon Min, et al.

A household robot should be able to navigate to target locations without requiring users to first annotate everything in their home. Current approaches to this object navigation challenge do not test on real robots and rely on expensive semantically labeled 3D meshes. In this work, we aim for an agent that builds self-supervised models of the world through exploration, much as a child might. We propose an end-to-end self-supervised embodied agent that uses exploration to train a semantic segmentation model of 3D objects, and then uses those representations to learn an object navigation policy purely from self-labeled 3D meshes. The key insight is that embodied agents can use location consistency as a supervision signal: they collect images of the same location from different views and angles and apply contrastive learning to fine-tune a semantic segmentation model. In our experiments, our framework outperforms other self-supervised baselines and is competitive with supervised baselines, both in simulation and when deployed in real houses.
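The location-consistency idea can be illustrated with a generic contrastive objective. The sketch below is an assumption, not the authors' implementation: the abstract does not say which contrastive loss is used, so a standard InfoNCE loss stands in for it, and the function name `info_nce_loss`, the `temperature` value, and the feature shapes are all hypothetical. Row `i` of each input holds the feature vector of the same 3D location as seen from two different views.

```python
import numpy as np


def info_nce_loss(view_a, view_b, temperature=0.1):
    """Generic InfoNCE loss (illustrative; not taken from the paper).

    view_a[i] and view_b[i] are assumed to be features of the same 3D
    location observed from two viewpoints; all other rows act as negatives.
    """
    # L2-normalise so the dot product is cosine similarity.
    a = view_a / np.linalg.norm(view_a, axis=1, keepdims=True)
    b = view_b / np.linalg.norm(view_b, axis=1, keepdims=True)

    # Pairwise similarities, scaled by the temperature.
    logits = a @ b.T / temperature

    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # The positive pair for row i sits on the diagonal.
    return float(-np.mean(np.diag(log_probs)))


# Hypothetical data: 8 locations with 16-dimensional features.
rng = np.random.default_rng(0)
features = rng.normal(size=(8, 16))
second_view = features + 0.01 * rng.normal(size=(8, 16))

loss_consistent = info_nce_loss(features, second_view)
loss_mismatched = info_nce_loss(features, np.roll(features, 1, axis=0))
```

When the two views agree on which features belong to which location, the loss is low; when the pairing is scrambled, as in `loss_mismatched`, it is high. Minimising such a loss is one way a model could be pushed toward view-invariant features for the same location.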

research
10/04/2022

Self-supervised Pre-training for Semantic Segmentation in an Indoor Scene

The ability to endow maps of indoor scenes with semantic information is ...
research
02/03/2021

Fast Concept Mapping: The Emergence of Human Abilities in Artificial Neural Networks when Learning Embodied and Self-Supervised

Most artificial neural networks used for object detection and recognitio...
research
12/02/2021

SEAL: Self-supervised Embodied Active Learning using Exploration and 3D Consistency

In this paper, we explore how we can build upon the data and models of I...
research
06/30/2022

Visual Pre-training for Navigation: What Can We Learn from Noise?

A powerful paradigm for sensorimotor control is to predict actions from ...
research
08/26/2022

The Foreseeable Future: Self-Supervised Learning to Predict Dynamic Scenes for Indoor Navigation

We present a method for generating, predicting, and using Spatiotemporal...
research
07/14/2020

Explore and Explain: Self-supervised Navigation and Recounting

Embodied AI has been recently gaining attention as it aims to foster the...
research
06/21/2022

SCIM: Simultaneous Clustering, Inference, and Mapping for Open-World Semantic Scene Understanding

In order to operate in human environments, a robot's semantic perception...
