Geometry-Aware Recurrent Neural Networks for Active Visual Recognition

11/03/2018
by   Ricson Cheng, et al.

We present recurrent geometry-aware neural networks that integrate visual information across multiple views of a scene into 3D latent feature tensors, while maintaining a one-to-one mapping between 3D physical locations in the world scene and latent feature locations. Object detection, object segmentation, and 3D reconstruction are then carried out directly on the constructed 3D feature memory, rather than on any of the input 2D images. The proposed models are equipped with differentiable egomotion-aware feature warping and (learned) depth-aware unprojection operations to achieve a geometrically consistent mapping between the features in the input frame and the constructed latent model of the scene. We empirically show that the proposed model generalizes much better than geometry-unaware LSTM/GRU networks, especially in the presence of multiple objects and cross-object occlusions. Combined with active view selection policies, our model learns to select informative viewpoints from which to integrate information by "undoing" cross-object occlusions, seamlessly combining geometry with learning from experience.
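
To make the idea of depth-aware unprojection and egomotion-aware alignment concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the paper's operations are differentiable and partly learned, whereas this version is a fixed, non-learned illustration. The function name `unproject_to_voxels`, the pinhole intrinsics `K`, the 4x4 `cam_to_world` pose, and the grid parameters are all assumptions introduced for this example. It lifts per-pixel 2D features into camera space using depth, moves them into a shared world frame using the camera pose, and mean-pools them into a voxel grid, so that the same physical location always maps to the same grid cell regardless of viewpoint.

```python
import numpy as np

def unproject_to_voxels(feat, depth, K, cam_to_world, grid_min, voxel_size, grid_shape):
    """Scatter per-pixel features (H, W, C) into a world-aligned voxel grid.

    Hypothetical helper for illustration only; returns (grid, count) where
    grid has shape grid_shape + (C,) and holds the mean feature per voxel.
    """
    H, W, C = feat.shape
    # Pixel coordinates: u runs along the width, v along the height.
    v, u = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    z = depth.astype(float)
    # Pinhole back-projection from pixels to camera coordinates.
    x = (u - K[0, 2]) * z / K[0, 0]
    y = (v - K[1, 2]) * z / K[1, 1]
    pts_cam = np.stack([x, y, z, np.ones_like(z)], axis=-1).reshape(-1, 4)
    # Egomotion step: express the points in the shared world frame.
    pts_world = (pts_cam @ np.asarray(cam_to_world, dtype=float).T)[:, :3]
    # World coordinates -> integer voxel indices.
    idx = np.floor((pts_world - np.asarray(grid_min, dtype=float)) / voxel_size).astype(int)
    in_grid = np.all((idx >= 0) & (idx < np.array(grid_shape)), axis=1)
    valid = in_grid & (z.reshape(-1) > 0)  # drop pixels with no depth
    flat = np.ravel_multi_index(tuple(idx[valid].T), grid_shape)
    n_vox = int(np.prod(grid_shape))
    grid = np.zeros((n_vox, C))
    count = np.zeros(n_vox)
    # Unbuffered accumulation so several pixels can land in one voxel.
    np.add.at(grid, flat, feat.reshape(-1, C)[valid])
    np.add.at(count, flat, 1.0)
    grid /= np.maximum(count, 1.0)[:, None]  # mean-pool, avoid divide-by-zero
    return grid.reshape(*grid_shape, C), count.reshape(grid_shape)
```

Calling this once per view with that view's pose and averaging (or recurrently updating) the resulting grids gives a simple stand-in for the 3D feature memory described above; a recurrent model would replace the averaging with a learned update.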



12/31/2018

Learning Spatial Common Sense with Geometry-Aware Recurrent Networks

We integrate two powerful ideas, geometry and deep visual representation...
02/20/2020

BlockGAN: Learning 3D Object-aware Scene Representations from Unlabelled Images

We present BlockGAN, an image generative model that learns object-aware ...
06/10/2019

Embodied View-Contrastive 3D Feature Learning

Humans can effortlessly imagine the occluded side of objects in a photog...
09/06/2019

Geometry-Aware Video Object Detection for Static Cameras

In this paper we propose a geometry-aware model for video object detecti...
04/01/2020

Object-Centric Image Generation with Factored Depths, Locations, and Appearances

We present a generative model of images that explicitly reasons over the...
08/23/2021

ODAM: Object Detection, Association, and Mapping using Posed RGB Video

Localizing objects and estimating their extent in 3D is an important ste...
11/09/2020

Multi-Agent Active Search using Realistic Depth-Aware Noise Model

The search for objects of interest in an unknown environment by making d...