SO(3)-Pose: SO(3)-Equivariance Learning for 6D Object Pose Estimation

by Haoran Pan, et al.

6D pose estimation of rigid objects from RGB-D images is crucial for object grasping and manipulation in robotics. Although the RGB channels and the depth (D) channel are often complementary, providing appearance and geometry information respectively, it remains non-trivial to fully exploit the two modalities. We start from a simple yet new observation: when an object rotates, its semantic label is invariant to the pose, whereas its keypoint offset directions vary with the pose. Based on this observation, we present SO(3)-Pose, a new representation learning network that explores SO(3)-equivariant and SO(3)-invariant features from the depth channel for pose estimation. The SO(3)-invariant features help learn more distinctive representations for segmenting objects that look similar in the RGB channels. The SO(3)-equivariant features communicate with RGB features to infer the geometry that is missing from the depth channel when detecting keypoints of objects with reflective surfaces. Unlike most existing pose estimation methods, SO(3)-Pose not only implements information communication between the RGB and depth channels, but also naturally absorbs SO(3)-equivariant geometry knowledge from depth images, leading to better appearance and geometry representation learning. Comprehensive experiments show that our method achieves state-of-the-art performance on three benchmarks.
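The observation behind the method can be checked numerically on a toy point cloud. The sketch below is not the SO(3)-Pose network; it is a minimal NumPy illustration, under assumed names (`random_rotation`, `unit_offsets`, `offset_lengths`) and synthetic data, of what equivariance and invariance mean here. Rotating an object rotates the unit direction from each point to a keypoint by the same rotation (equivariant), while a rotation-independent quantity such as the point-to-keypoint distance is unchanged (invariant, standing in for pose-independent properties like the semantic label).

```python
import numpy as np

def random_rotation(rng):
    """Draw a random 3x3 rotation matrix (hypothetical helper, not from the paper)."""
    # QR decomposition of a Gaussian matrix yields an orthogonal matrix q.
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))   # normalize column signs
    if np.linalg.det(q) < 0:      # force det = +1 so q is a rotation, not a reflection
        q[:, 0] = -q[:, 0]
    return q

def unit_offsets(points, keypoint):
    """Unit direction from every point to the keypoint, shape (N, 3)."""
    d = keypoint - points
    return d / np.linalg.norm(d, axis=1, keepdims=True)

def offset_lengths(points, keypoint):
    """Distance from every point to the keypoint, shape (N,)."""
    return np.linalg.norm(keypoint - points, axis=1)

rng = np.random.default_rng(0)
points = rng.normal(size=(100, 3))       # synthetic object point cloud
keypoint = np.array([0.5, -0.2, 1.0])    # synthetic keypoint on the object
R = random_rotation(rng)

# Rotate the whole object: points and keypoint move together.
rot_points = points @ R.T
rot_keypoint = R @ keypoint

# Equivariance: directions computed after rotating equal rotated directions.
equivariant_ok = np.allclose(
    unit_offsets(rot_points, rot_keypoint),
    unit_offsets(points, keypoint) @ R.T,
)

# Invariance: distances do not change under rotation.
invariant_ok = np.allclose(
    offset_lengths(rot_points, rot_keypoint),
    offset_lengths(points, keypoint),
)

print("offset directions equivariant:", equivariant_ok)
print("offset lengths invariant:", invariant_ok)
```

A learned feature extractor is SO(3)-equivariant or SO(3)-invariant in the same sense: the checks above would hold with the hand-written functions replaced by the network's feature maps.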




RGB-based 3D Hand Pose Estimation via Privileged Learning with Depth Images

This paper proposes a method for hand pose estimation from RGB images th...

6D Pose Estimation with Correlation Fusion

6D object pose estimation is widely applied in robotic tasks such as gra...

The Best of Both Worlds: Learning Geometry-based 6D Object Pose Estimation

We address the task of estimating the 6D pose of known rigid objects, fr...

Exploring Intermediate Representation for Monocular Vehicle Pose Estimation

We present a new learning-based approach to recover egocentric 3D vehicl...

3D Pose Estimation and 3D Model Retrieval for Objects in the Wild

We propose a scalable, efficient and accurate approach to retrieve 3D mo...

6D-ViT: Category-Level 6D Object Pose Estimation via Transformer-based Instance Representation Learning

This paper presents 6D-ViT, a transformer-based instance representation ...

Sparse Pose Trajectory Completion

We propose a method to learn, even using a dataset where objects appear ...
