Object Manipulation via Visual Target Localization

03/15/2022
by Lucas Taylor, et al.

Object manipulation is a critical skill for Embodied AI agents interacting with the world around them. Training agents to manipulate objects poses many challenges. These include occlusion of the target object by the agent's arm, noisy object detection and localization, and the target frequently going out of view as the agent moves around the scene. We propose Manipulation via Visual Object Location Estimation (m-VOLE), an approach that explores the environment in search of target objects, computes their 3D coordinates once they are located, and then continues to estimate their 3D locations even when the objects are not visible, thus robustly aiding the manipulation of these objects throughout the episode. Our evaluations show a 3x improvement in success rate over a model that has access to the same sensory suite but is trained without the object location estimator, and our analysis shows that our agent is robust to noise in depth perception and agent localization. Importantly, our proposed approach relaxes several assumptions about idealized localization and perception that are commonly made by recent work in embodied AI, an important step towards training agents for object manipulation in the real world.
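
The core idea in the abstract, computing a target's 3D coordinates when it is seen and then carrying that estimate forward while it is out of view, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a pinhole camera with known intrinsics (`fx`, `fy`, `cx`, `cy`), a camera frame with x to the right, y down, and z forward, and noise-free odometry given as a forward step followed by a yaw turn. The names `backproject`, `apply_agent_motion`, and `TargetLocationEstimator` are hypothetical.

```python
import math


def backproject(u, v, depth, fx, fy, cx, cy):
    """Pinhole back-projection of pixel (u, v) with metric depth
    into camera-frame coordinates (x right, y down, z forward)."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def apply_agent_motion(target, forward, yaw_deg):
    """Re-express a camera-frame point after the agent moves `forward`
    metres along its z axis and then turns right by `yaw_deg` degrees.
    Assumed motion model: translate first, then rotate about the vertical axis."""
    x, y, z = target
    z -= forward  # moving forward brings the point closer along z
    t = math.radians(yaw_deg)
    # Turning right by t makes the point appear rotated left by t.
    new_x = x * math.cos(t) - z * math.sin(t)
    new_z = x * math.sin(t) + z * math.cos(t)
    return (new_x, y, new_z)


class TargetLocationEstimator:
    """Keeps an agent-relative 3D estimate of a target (illustrative only)."""

    def __init__(self, fx, fy, cx, cy):
        self.intrinsics = (fx, fy, cx, cy)
        self.estimate = None  # unknown until the target is first detected

    def step(self, forward, yaw_deg, detection=None):
        """`detection` is (u, v, depth) for a target pixel, or None if unseen."""
        if self.estimate is not None:
            # Carry the previous estimate into the new agent frame.
            self.estimate = apply_agent_motion(self.estimate, forward, yaw_deg)
        if detection is not None:
            # A fresh observation overrides the propagated estimate.
            u, v, depth = detection
            self.estimate = backproject(u, v, depth, *self.intrinsics)
        return self.estimate


estimator = TargetLocationEstimator(fx=100.0, fy=100.0, cx=50.0, cy=50.0)
seen = estimator.step(0.0, 0.0, detection=(50.0, 50.0, 2.0))  # target straight ahead
hidden = estimator.step(1.0, 90.0)  # target out of view: odometry only
```

In this sketch a fresh detection simply replaces the propagated estimate; a system that must tolerate noisy depth or localization, as the abstract describes, would more plausibly fuse the two.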


