ConvFormer: Parameter Reduction in Transformer Models for 3D Human Pose Estimation by Leveraging Dynamic Multi-Headed Convolutional Attention

04/04/2023
by Alec Diaz-Arias, et al.

Recently, fully transformer architectures have replaced the de facto convolutional architecture for the 3D human pose estimation task. In this paper, we propose ConvFormer, a novel convolutional transformer that leverages a new dynamic multi-headed convolutional self-attention mechanism for monocular 3D human pose estimation. We design a spatial and temporal convolutional transformer to comprehensively model human joint relations within individual frames and globally across the motion sequence. Moreover, we introduce a novel notion of a temporal joints profile for our temporal ConvFormer, which immediately fuses complete temporal information for a local neighborhood of joint features. We validate our method quantitatively and qualitatively on three common benchmark datasets: Human3.6M, MPI-INF-3DHP, and HumanEva. Extensive experiments were conducted to identify the optimal hyper-parameter set. These experiments demonstrate a significant parameter reduction relative to prior transformer models while attaining state-of-the-art (SOTA) or near-SOTA results on all three datasets. Additionally, we achieve SOTA for Protocol III on Human3.6M for both ground-truth (GT) and CPN detection inputs. Finally, we obtain SOTA on all three metrics for the MPI-INF-3DHP dataset and for all three subjects on HumanEva under Protocol II.

research
03/29/2021

3D Human Pose Estimation with Spatial and Temporal Transformers

Transformer architectures have become the model of choice in natural lan...
research
03/24/2022

CrossFormer: Cross Spatio-Temporal Transformer for 3D Human Pose Estimation

3D human pose estimation can be handled by encoding the geometric depend...
research
02/15/2023

Pose-Oriented Transformer with Uncertainty-Guided Refinement for 2D-to-3D Human Pose Estimation

There has been a recent surge of interest in introducing transformers to...
research
12/16/2018

Human Pose and Path Estimation from Aerial Video using Dynamic Classifier Selection

We consider the problem of estimating human pose and trajectory by an ae...
research
10/08/2022

(Fusionformer): Exploiting the Joint Motion Synergy with Fusion Network Based On Transformer for 3D Human Pose Estimation

For the current 3D human pose estimation task, in order to improve the e...
research
03/26/2021

Lifting Transformer for 3D Human Pose Estimation in Video

Despite great progress in video-based 3D human pose estimation, it is st...
research
08/07/2022

Jointformer: Single-Frame Lifting Transformer with Error Prediction and Refinement for 3D Human Pose Estimation

Monocular 3D human pose estimation technologies have the potential to gr...
