A Simple Approach for Visual Rearrangement: 3D Mapping and Semantic Search

06/21/2022
by Brandon Trabucco, et al.

Physically rearranging objects is an important capability for embodied agents. Visual room rearrangement evaluates an agent's ability to rearrange objects in a room to a desired goal based solely on visual input. We propose a simple yet effective method for this problem: (1) search for and map which objects need to be rearranged, and (2) rearrange each object until the task is complete. Our approach consists of an off-the-shelf semantic segmentation model, a voxel-based semantic map, and a semantic search policy that efficiently finds objects that need to be rearranged. On the AI2-THOR Rearrangement Challenge, our method improves on current state-of-the-art end-to-end reinforcement learning-based methods that learn visual rearrangement policies, from 0.53% correct rearrangement to 16.56%, using only 2.7% as many samples from the environment.
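As a rough illustration of step (1), the sketch below builds a per-class voxel map from labelled 3D points and flags classes whose occupied voxels disagree between a goal map and a current map. It is a minimal sketch, not the authors' implementation: the function names `build_semantic_map` and `objects_to_rearrange`, the voxel size, the overlap threshold, and the use of intersection-over-union as the disagreement test are all assumptions made for this example.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Point = Tuple[float, float, float]
Voxel = Tuple[int, int, int]


def build_semantic_map(
    points: Iterable[Tuple[Point, str]], voxel_size: float = 0.25
) -> Dict[str, Set[Voxel]]:
    """Group labelled 3D points into per-class sets of occupied voxels.

    Hypothetical helper: each input is ((x, y, z), class_label), e.g. a
    depth pixel back-projected to 3D and labelled by a segmentation model.
    """
    voxels: Dict[str, Set[Voxel]] = defaultdict(set)
    for (x, y, z), label in points:
        # Quantize each coordinate to an integer voxel index.
        key = (
            math.floor(x / voxel_size),
            math.floor(y / voxel_size),
            math.floor(z / voxel_size),
        )
        voxels[label].add(key)
    return dict(voxels)


def objects_to_rearrange(
    goal_map: Dict[str, Set[Voxel]],
    current_map: Dict[str, Set[Voxel]],
    min_overlap: float = 0.5,
) -> List[str]:
    """Return class labels whose voxels disagree between the two maps.

    Assumption: disagreement is measured by voxel intersection-over-union
    falling below `min_overlap`.
    """
    moved = []
    for label in sorted(set(goal_map) | set(current_map)):
        goal = goal_map.get(label, set())
        current = current_map.get(label, set())
        union = goal | current
        iou = len(goal & current) / len(union) if union else 1.0
        if iou < min_overlap:
            moved.append(label)
    return moved


# Toy usage: the mug has moved between the goal and current scenes, the book has not.
goal_points = [((1.0, 0.9, 2.0), "mug"), ((0.3, 0.5, 0.3), "book")]
current_points = [((3.0, 0.9, 0.5), "mug"), ((0.3, 0.5, 0.3), "book")]
print(objects_to_rearrange(build_semantic_map(goal_points),
                           build_semantic_map(current_points)))
```

Step (2) would then visit each flagged class and move the object from its current voxels toward its goal voxels; that part depends on the simulator's action interface and is not sketched here.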

research
07/21/2022

TIDEE: Tidying Up Novel Rooms using Visuo-Semantic Commonsense Priors

We introduce TIDEE, an embodied agent that tidies up a disordered scene ...
research
01/15/2020

3D Object Segmentation for Shelf Bin Picking by Humanoid with Deep Learning and Occupancy Voxel Grid Map

Picking objects in a narrow space such as shelf bins is an important tas...
research
08/31/2021

SemIE: Semantically-aware Image Extrapolation

We propose a semantically-aware novel paradigm to perform image extrapol...
research
03/30/2021

Visual Room Rearrangement

There has been a significant recent progress in the field of Embodied AI...
research
03/02/2022

Vision-based Large-scale 3D Semantic Mapping for Autonomous Driving Applications

In this paper, we present a complete pipeline for 3D semantic mapping so...
research
09/21/2018

GAPLE: Generalizable Approaching Policy LEarning for Robotic Object Searching in Indoor Environment

We study the problem of learning a generalizable action policy for an in...
research
03/21/2021

MaAST: Map Attention with Semantic Transformers for Efficient Visual Navigation

Visual navigation for autonomous agents is a core task in the fields of ...
