Depth-based 6DoF Object Pose Estimation using Swin Transformer

03/03/2023
by Zhujun Li, et al.

Accurately estimating the 6D pose of objects is crucial for many applications, such as robotic grasping, autonomous driving, and augmented reality. The task becomes harder in poor lighting conditions or when dealing with textureless objects. Depth images are an increasingly popular input in these settings because they are invariant to a scene's appearance and implicitly encode essential geometric characteristics. However, fully leveraging depth information to improve pose estimation remains a difficult and under-investigated problem. To tackle this challenge, we propose SwinDePose, a novel framework that uses only geometric information from depth images to achieve accurate 6D pose estimation. SwinDePose first calculates the angles between each normal vector defined in a depth image and the three coordinate axes of the camera coordinate system. The resulting angles are formed into an image, which is encoded using Swin Transformer. In addition, we apply RandLA-Net to learn representations from point clouds. The resulting image and point cloud embeddings are concatenated and fed into a semantic segmentation module and a 3D keypoint localization module. Finally, we estimate 6D poses with a least-squares fitting approach based on the target object's predicted semantic mask and 3D keypoints. In experiments on the LineMod and Occlusion LineMod datasets, SwinDePose outperforms existing state-of-the-art methods for 6D object pose estimation using depth images. This demonstrates the effectiveness of our approach and highlights its potential for improving performance in real-world scenarios. Our code is at https://github.com/zhujunli1993/SwinDePose.
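The two purely geometric steps named in the abstract, the normal-angle image and the least-squares pose fit, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names `normal_angle_image` and `fit_rigid_transform`, the pinhole intrinsics `fx`, `fy`, `cx`, `cy`, the finite-difference normal estimate, and the choice of the SVD-based (Kabsch) solution for the least-squares fit are all assumptions made for illustration. The sign convention of the normals (here pointing along +z for a fronto-parallel plane) is also an assumption.

```python
import numpy as np


def normal_angle_image(depth, fx, fy, cx, cy):
    """Hypothetical helper: angles (degrees) between per-pixel surface normals
    and the camera x, y, z axes, returned as an H x W x 3 image."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.astype(np.float64)
    # Back-project every pixel to a 3D point with an assumed pinhole model.
    points = np.stack(((u - cx) * z / fx, (v - cy) * z / fy, z), axis=-1)
    # Tangent vectors from finite differences along image columns and rows.
    du = np.gradient(points, axis=1)
    dv = np.gradient(points, axis=0)
    normals = np.cross(du, dv)
    length = np.linalg.norm(normals, axis=-1, keepdims=True)
    normals = normals / np.maximum(length, 1e-12)
    # For a unit normal n, the angle to axis k is arccos(n_k).
    return np.degrees(np.arccos(np.clip(normals, -1.0, 1.0)))


def fit_rigid_transform(src, dst):
    """Least-squares rigid fit (Kabsch): find R, t minimising
    sum ||R @ src_i + t - dst_i||^2 for matched N x 3 point sets."""
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    # 3 x 3 cross-covariance of the centred point sets.
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t


if __name__ == "__main__":
    # Synthetic check: recover a known pose from 8 matched keypoints.
    rng = np.random.default_rng(0)
    model_kps = rng.normal(size=(8, 3))   # keypoints in the object frame
    theta = np.pi / 6
    R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                       [np.sin(theta), np.cos(theta), 0.0],
                       [0.0, 0.0, 1.0]])
    t_true = np.array([0.1, -0.2, 0.5])
    scene_kps = model_kps @ R_true.T + t_true  # keypoints in the camera frame
    R, t = fit_rigid_transform(model_kps, scene_kps)
    print(np.allclose(R, R_true), np.allclose(t, t_true))
```

In a pipeline like the one described, `src` would hold the keypoints defined on the object model and `dst` the keypoints predicted in the camera frame for the pixels selected by the semantic mask; how the predicted keypoints are obtained from the network outputs is not covered by this sketch.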


research
11/25/2022

PoET: Pose Estimation Transformer for Single-View, Multi-Object 6D Pose Estimation

Accurate 6D object pose estimation is an important task for a variety of...
research
10/19/2020

SHREC 2020 track: 6D Object Pose Estimation

6D pose estimation is crucial for augmented reality, virtual reality, ro...
research
05/13/2018

DeLS-3D: Deep Localization and Segmentation with a 3D Semantic Map

For applications such as autonomous driving, self-localization/camera po...
research
05/07/2022

BiCo-Net: Regress Globally, Match Locally for Robust 6D Pose Estimation

The challenges of learning a robust 6D pose function lie in 1) severe oc...
research
02/03/2020

YOLOff: You Only Learn Offsets for robust 6DoF object pose estimation

Estimating the 3D translation and orientation of an object is a challeng...
research
10/10/2021

6D-ViT: Category-Level 6D Object Pose Estimation via Transformer-based Instance Representation Learning

This paper presents 6D-ViT, a transformer-based instance representation ...
research
12/27/2019

One Point, One Object: Simultaneous 3D Object Segmentation and 6-DOF Pose Estimation

We propose a single-shot method for simultaneous 3D object segmentation ...
