Geometry-Aware Recurrent Neural Networks for Active Visual Recognition

by Ricson Cheng, et al.

We present recurrent geometry-aware neural networks that integrate visual information across multiple views of a scene into 3D latent feature tensors, while maintaining a one-to-one mapping between 3D physical locations in the world scene and latent feature locations. Object detection, object segmentation, and 3D reconstruction are then carried out directly on the constructed 3D feature memory, rather than on any of the input 2D images. The proposed models are equipped with differentiable egomotion-aware feature warping and (learned) depth-aware unprojection operations to achieve a geometrically consistent mapping between the features in the input frame and the constructed latent model of the scene. We empirically show that the proposed model generalizes much better than geometry-unaware LSTM/GRU networks, especially in the presence of multiple objects and cross-object occlusions. Combined with active view selection policies, our model learns to select informative viewpoints from which to integrate information, "undoing" cross-object occlusions and seamlessly combining geometry with learning from experience.
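To make the idea of depth-aware unprojection with egomotion compensation concrete, here is a minimal NumPy sketch. It is not the authors' implementation: it assumes a pinhole camera, a depth map that is given rather than learned, a known camera-to-world pose, and a hard nearest-voxel scatter (which is not differentiable with respect to depth or pose, unlike the operations the abstract describes). The function name `unproject_to_voxels` and all parameters (`K`, `cam_to_world`, `grid_min`, `voxel_size`, `grid_shape`) are hypothetical, chosen only for illustration.

```python
import numpy as np

def unproject_to_voxels(feat, depth, K, cam_to_world, grid_min, voxel_size, grid_shape):
    """Scatter per-pixel features into a world-aligned voxel grid (illustrative only).

    feat:         (H, W, C) 2D feature map
    depth:        (H, W) depth along the camera z-axis; values <= 0 mark invalid pixels
    K:            (3, 3) pinhole intrinsics (assumed known)
    cam_to_world: (4, 4) rigid transform for this view (the egomotion, assumed known)
    Returns the summed features per voxel and the number of pixels that hit each voxel.
    """
    H, W, C = feat.shape
    v, u = np.mgrid[0:H, 0:W]
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).astype(np.float64)
    d = depth.reshape(-1).astype(np.float64)

    # Back-project each pixel to a camera-frame 3D point: X = d * K^-1 [u, v, 1]^T
    pts_cam = (np.linalg.inv(K) @ pix.T).T * d[:, None]
    # Apply the camera pose so every view writes into the same world frame
    pts_world = pts_cam @ cam_to_world[:3, :3].T + cam_to_world[:3, 3]

    # Nearest-voxel index; keep only valid pixels that fall inside the grid
    idx = np.floor((pts_world - np.asarray(grid_min)) / voxel_size).astype(int)
    inside = (d > 0) & np.all((idx >= 0) & (idx < np.array(grid_shape)), axis=1)
    idx = idx[inside]
    f = feat.reshape(-1, C)[inside]

    grid = np.zeros(tuple(grid_shape) + (C,))
    count = np.zeros(tuple(grid_shape))
    np.add.at(grid, (idx[:, 0], idx[:, 1], idx[:, 2]), f)
    np.add.at(count, (idx[:, 0], idx[:, 1], idx[:, 2]), 1)
    return grid, count

# Toy check: one world point at (0, 0, 2) seen from two camera positions.
K = np.array([[100.0, 0.0, 32.0], [0.0, 100.0, 24.0], [0.0, 0.0, 1.0]])
H, W, C = 48, 64, 4
feat = np.ones((H, W, C))
grid_min, voxel_size, grid_shape = (-1.0, -1.0, 0.0), 0.3, (8, 8, 16)

pose1 = np.eye(4)                       # camera at the world origin
depth1 = np.zeros((H, W))
depth1[24, 32] = 2.0                    # the point projects to the image centre

pose2 = np.eye(4)
pose2[0, 3] = 0.2                       # camera shifted 0.2 along world x
depth2 = np.zeros((H, W))
depth2[24, 22] = 2.0                    # the same point now projects to u = 22

grid1, count1 = unproject_to_voxels(feat, depth1, K, pose1, grid_min, voxel_size, grid_shape)
grid2, count2 = unproject_to_voxels(feat, depth2, K, pose2, grid_min, voxel_size, grid_shape)

# Both views land in the same voxel, so their features can be fused cell by cell.
voxels1 = np.argwhere(count1 > 0)
voxels2 = np.argwhere(count2 > 0)
```

In this toy setup the two views observe the point at different pixels, yet `voxels1` and `voxels2` contain the same single voxel index. That is the one-to-one correspondence between world locations and memory locations that a recurrent update over the 3D feature tensor relies on; how the per-voxel features are then fused is left out of this sketch.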




