Kinematic-aware Hierarchical Attention Network for Human Pose Estimation in Videos

11/29/2022
by Kyung-Min Jin, et al.

Previous video-based human pose estimation methods have shown promising results by aggregating features from consecutive frames. However, most approaches either sacrifice accuracy to reduce jitter or fail to adequately model the temporal aspects of human motion. Furthermore, occlusion increases uncertainty between consecutive frames, which leads to non-smooth predictions. To address these issues, we design an architecture that exploits keypoint kinematic features with the following components. First, we capture temporal features by leveraging the velocity and acceleration of each individual keypoint. Second, the proposed hierarchical transformer encoder aggregates spatio-temporal dependencies and refines the 2D or 3D input pose estimated by existing estimators. Finally, we provide online cross-supervision between the refined input pose generated by the encoder and the final pose from our decoder to enable joint optimization. We demonstrate comprehensive results and validate the effectiveness of our model on various tasks: 2D pose estimation, 3D pose estimation, body mesh recovery, and sparsely annotated multi-human pose estimation. Our code is available at https://github.com/KyungMinJin/HANet.
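
To make the idea of per-keypoint velocity and acceleration concrete, the following is a minimal sketch that derives both from a pose sequence using backward finite differences. It is an illustration only and is not taken from the HANet code: the function name `kinematic_features`, the `(T, J, D)` array layout, the `fps` parameter, and the choice to zero-pad the first frames are all assumptions, and the paper's actual formulation may differ.

```python
import numpy as np


def kinematic_features(poses: np.ndarray, fps: float = 1.0):
    """Hypothetical helper: finite-difference velocity and acceleration.

    poses: array of shape (T, J, D) with T frames, J keypoints, D coordinates.
    Returns (velocity, acceleration), each with the same shape as `poses`.
    Frames without enough history are left at zero (an assumption of this sketch).
    """
    poses = np.asarray(poses, dtype=float)
    dt = 1.0 / fps

    velocity = np.zeros_like(poses)
    acceleration = np.zeros_like(poses)

    # v[t] = (p[t] - p[t-1]) / dt, defined from the second frame onward.
    velocity[1:] = np.diff(poses, n=1, axis=0) / dt
    # a[t] = (p[t] - 2 p[t-1] + p[t-2]) / dt^2, defined from the third frame onward.
    acceleration[2:] = np.diff(poses, n=2, axis=0) / dt**2

    return velocity, acceleration


# Usage: one keypoint whose x-coordinate follows t^2 over five frames.
poses = np.array([[[t**2, 0.0]] for t in range(5)], dtype=float)
vel, acc = kinematic_features(poses)
print(vel[:, 0, 0])  # x-velocity per frame
print(acc[:, 0, 0])  # x-acceleration per frame
```

In this sketch the velocity and acceleration arrays keep the shape of the input, so they could be concatenated with the keypoint coordinates as extra input features. Whether the model consumes them that way is not stated in the abstract.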

research
07/20/2022

OTPose: Occlusion-Aware Transformer for Pose Estimation in Sparsely-Labeled Videos

Although many approaches for multi-human pose estimation in videos have ...
research
02/22/2020

Back to the Future: Joint Aware Temporal Deep Learning 3D Human Pose Estimation

We propose a new deep learning network that introduces a deeper CNN chan...
research
09/06/2021

Encoder-decoder with Multi-level Attention for 3D Human Shape and Pose Estimation

3D human shape and pose estimation is the essential task for human motio...
research
10/12/2022

MotionBERT: Unified Pretraining for Human Motion Analysis

We present MotionBERT, a unified pretraining framework, to tackle differ...
research
12/27/2021

SmoothNet: A Plug-and-Play Network for Refining Human Poses in Videos

When analyzing human motion videos, the output jitters from existing pos...
research
08/25/2022

FusePose: IMU-Vision Sensor Fusion in Kinematic Space for Parametric Human Pose Estimation

There exist challenging problems in 3D human pose estimation mission, su...
research
12/22/2021

Improved 2D Keypoint Detection in Out-of-Balance and Fall Situations – combining input rotations and a kinematic model

Injury analysis may be one of the most beneficial applications of deep l...
