T6D-Direct: Transformers for Multi-Object 6D Pose Direct Regression

09/22/2021
by Arash Amini, et al.

6D pose estimation is the task of predicting the translation and orientation of objects in a given input image, and it is a crucial prerequisite for many robotics and augmented reality applications. Recently, the Transformer architecture, built on multi-head self-attention, has achieved state-of-the-art results in many computer vision tasks. DETR, a Transformer-based model, formulates object detection as a set prediction problem and achieves impressive results without standard components such as region-of-interest pooling, non-maximum suppression, and bounding box proposals. In this work, we propose T6D-Direct, a real-time, single-stage direct method with a Transformer-based architecture built on DETR for multi-object 6D pose estimation. We evaluate our method on the YCB-Video dataset. It achieves the fastest inference time among the methods compared, and its pose estimation accuracy is comparable to state-of-the-art methods.
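
To make the term concrete, a 6D pose can be read as a rigid transform: a 3D rotation (orientation) plus a 3D translation that maps points from the object's frame into the camera frame. The following is a minimal sketch of that idea only, not the T6D-Direct model or its output format. The helper names `rotation_z` and `apply_pose` and the numeric values are hypothetical and chosen for illustration.

```python
import math


def rotation_z(angle_rad):
    """Return a 3x3 rotation matrix for a rotation about the z-axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]


def apply_pose(rotation, translation, point):
    """Map a point from the object frame to the camera frame: R @ p + t."""
    return [
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]


# Hypothetical pose (assumed values): the object is rotated 90 degrees
# about z and sits 0.5 m in front of the camera along the z-axis.
R = rotation_z(math.pi / 2)
t = [0.0, 0.0, 0.5]

# A point 0.1 m along the object's x-axis, expressed in the camera frame.
camera_point = apply_pose(R, t, [0.1, 0.0, 0.0])
```

Under these assumed values, the point moves from the object's x-axis onto the camera's y-axis and is shifted 0.5 m along z. A pose estimator predicts one such rotation and translation per object in the image.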

research
05/05/2022

YOLOPose: Transformer-based Multi-Object 6D Pose Estimation using Keypoint Regression

6D object pose estimation is a crucial prerequisite for autonomous robot...
research
07/21/2023

YOLOPose V2: Understanding and Improving Transformer-based 6D Pose Estimation

6D object pose estimation is a crucial prerequisite for autonomous robot...
research
03/29/2021

TFPose: Direct Human Pose Estimation with Transformers

We propose a human pose estimation framework that solves the task in the...
research
12/22/2020

A Structure-Aware Method for Direct Pose Estimation

Estimating camera pose from a single image is a fundamental problem in c...
research
12/03/2021

Efficient Two-Stage Detection of Human-Object Interactions with a Novel Unary-Pairwise Transformer

Recent developments in transformer models for visual data have led to si...
research
07/06/2021

Automatic size and pose homogenization with spatial transformer network to improve and accelerate pediatric segmentation

Due to a high heterogeneity in pose and size and to a limited number of ...
research
11/20/2022

A Lightweight Domain Adaptive Absolute Pose Regressor Using Barlow Twins Objective

Identifying the camera pose for a given image is a challenging problem w...
