Equivariant Single View Pose Prediction Via Induced and Restricted Representations

07/07/2023
by Owen Howell, et al.

Learning about the three-dimensional world from two-dimensional images is a fundamental problem in computer vision. An ideal neural network architecture for such tasks would leverage the fact that objects can be rotated and translated in three dimensions to make predictions about novel images. However, imposing SO(3)-equivariance on two-dimensional inputs is difficult because the group of three-dimensional rotations does not have a natural action on the two-dimensional plane. Specifically, an element of SO(3) may rotate an image out of plane. We show that an algorithm that learns a three-dimensional representation of the world from two-dimensional images must satisfy certain geometric consistency properties, which we formulate as SO(2)-equivariance constraints. We use the induced and restricted representations of SO(2) on SO(3) to construct and classify architectures that satisfy these geometric consistency constraints. We prove that any architecture that respects these consistency constraints can be realized as an instance of our construction. We show that three previously proposed neural architectures for 3D pose prediction are special cases of our construction. We propose a new algorithm that is a learnable generalization of previously considered methods. We test our architecture on three pose prediction tasks and achieve state-of-the-art results on both the PASCAL3D+ and SYMSOL pose estimation tasks.
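
The claim that only in-plane rotations act naturally on an image can be checked numerically. The sketch below is illustrative only and is not the architecture from the paper. It assumes an orthographic camera looking along the z-axis, and the helper names (`rot_z`, `rot_x`, `rot_2d`, `project`) are hypothetical. It shows that a rotation about the camera axis commutes with projection, so it acts on the image as an element of SO(2), whereas an out-of-plane rotation produces an image that depends on depth information the projection has discarded.

```python
import numpy as np


def rot_z(theta):
    """3D rotation about the camera (z) axis: an in-plane rotation."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def rot_x(theta):
    """3D rotation about the x axis: an out-of-plane rotation."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])


def rot_2d(theta):
    """2D rotation of the image plane, an element of SO(2)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])


def project(points):
    """Orthographic projection (an assumption of this sketch): drop z."""
    return points[:, :2]


rng = np.random.default_rng(0)
points = rng.normal(size=(5, 3))  # a toy 3D point cloud, one point per row
theta = 0.7

# In-plane case: rotating in 3D and then projecting equals
# projecting and then rotating the 2D image.
in_plane_commutes = np.allclose(
    project(points @ rot_z(theta).T),
    project(points) @ rot_2d(theta).T,
)

# Out-of-plane case: two scenes that differ only in depth have the same
# image, but their images differ after a rotation about the x axis.
# Hence no map defined on images alone can reproduce this rotation.
shifted = points.copy()
shifted[:, 2] += 1.0
same_image_before = np.allclose(project(points), project(shifted))
same_image_after = np.allclose(
    project(points @ rot_x(theta).T),
    project(shifted @ rot_x(theta).T),
)

print(in_plane_commutes, same_image_before, same_image_after)
```

Under these assumptions, the first two checks hold and the third fails, which is one concrete way to see why consistency constraints on image inputs are naturally stated for the SO(2) subgroup rather than for all of SO(3).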


