Geometry-Aware Recurrent Neural Networks for Active Visual Recognition

11/03/2018
by Ricson Cheng, et al.

We present recurrent geometry-aware neural networks that integrate visual information across multiple views of a scene into 3D latent feature tensors, while maintaining a one-to-one mapping between 3D physical locations in the world scene and latent feature locations. Object detection, object segmentation, and 3D reconstruction are then carried out directly on the constructed 3D feature memory, rather than on any of the input 2D images. The proposed models are equipped with differentiable egomotion-aware feature warping and (learned) depth-aware unprojection operations to achieve a geometrically consistent mapping between the features in the input frame and the constructed latent model of the scene. We show empirically that the proposed model generalizes much better than geometry-unaware LSTM/GRU networks, especially in the presence of multiple objects and cross-object occlusions. Combined with active view selection policies, our model learns to select informative viewpoints from which to integrate information by "undoing" cross-object occlusions, seamlessly combining geometry with learning from experience.
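
To make the idea of depth-aware unprojection and egomotion-aware warping more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a pinhole camera with intrinsics `K`, a known camera-to-world pose (`R`, `t`), and a depth map given as input (the abstract describes depth as learned). The function name `unproject_to_grid`, the grid size, and the scene bound are all hypothetical choices made for illustration.

```python
import numpy as np

def unproject_to_grid(feat, depth, K, R, t, grid_size=8, bound=2.0):
    """Scatter a 2D feature map (H, W, C) into a world-aligned voxel grid.

    Illustrative assumptions: pinhole intrinsics K (3x3), camera-to-world
    rotation R (3x3) and translation t (3,), and a scene contained in the
    cube [-bound, bound)^3. Returns summed features and per-voxel hit counts.
    """
    H, W, C = feat.shape
    v, u = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).astype(np.float64)

    # Depth-aware unprojection: X_cam = depth * K^{-1} [u, v, 1]^T
    rays = pix @ np.linalg.inv(K).T
    pts_cam = rays * depth.reshape(-1, 1)

    # Egomotion-aware step: express the points in the shared world frame
    pts_world = pts_cam @ R.T + t

    # Map world coordinates to integer voxel indices
    idx = np.floor((pts_world + bound) / (2 * bound) * grid_size).astype(int)
    valid = np.all((idx >= 0) & (idx < grid_size), axis=1) & (depth.reshape(-1) > 0)
    idx, f = idx[valid], feat.reshape(-1, C)[valid]

    grid = np.zeros((grid_size,) * 3 + (C,))
    count = np.zeros((grid_size,) * 3)
    np.add.at(grid, (idx[:, 0], idx[:, 1], idx[:, 2]), f)
    np.add.at(count, (idx[:, 0], idx[:, 1], idx[:, 2]), 1)
    return grid, count

def fuse(views, **kwargs):
    """Average features from several views into one 3D memory.

    A plain average stands in here for the learned recurrent update.
    Each view is a tuple (feat, depth, K, R, t).
    """
    total, hits = None, None
    for feat, depth, K, R, t in views:
        g, c = unproject_to_grid(feat, depth, K, R, t, **kwargs)
        total = g if total is None else total + g
        hits = c if hits is None else hits + c
    return total / np.maximum(hits, 1)[..., None]
```

Because every view is mapped into the same world frame before voxelization, a physical point seen from two different camera poses writes to the same voxel, which is the one-to-one correspondence between world locations and memory locations that the abstract describes. The sketch omits everything learned (the 2D feature extractor, depth prediction, and the recurrent update) and is not differentiable as written.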

research
12/31/2018

Learning Spatial Common Sense with Geometry-Aware Recurrent Networks

We integrate two powerful ideas, geometry and deep visual representation...
research
02/20/2020

BlockGAN: Learning 3D Object-aware Scene Representations from Unlabelled Images

We present BlockGAN, an image generative model that learns object-aware ...
research
06/10/2019

Embodied View-Contrastive 3D Feature Learning

Humans can effortlessly imagine the occluded side of objects in a photog...
research
09/06/2019

Geometry-Aware Video Object Detection for Static Cameras

In this paper we propose a geometry-aware model for video object detecti...
research
08/22/2023

Affordance segmentation of hand-occluded containers from exocentric images

Visual affordance segmentation identifies the surfaces of an object an a...
research
11/06/2020

Disentangling 3D Prototypical Networks For Few-Shot Concept Learning

We present neural architectures that disentangle RGB-D images into objec...
research
08/21/2018

Deep Learned Full-3D Object Completion from Single View

3D geometry is a very informative cue when interacting with and navigati...
