Temporal Feature Alignment and Mutual Information Maximization for Video-Based Human Pose Estimation

03/29/2022
by Zhenguang Liu, et al.

Multi-frame human pose estimation has long been a compelling and fundamental problem in computer vision. The task is challenging because fast motion and pose occlusion occur frequently in videos. State-of-the-art methods strive to incorporate additional visual evidence from neighboring frames (supporting frames) to facilitate pose estimation for the current frame (key frame). One aspect that has been overlooked so far is that current methods directly aggregate unaligned contexts across frames. The spatial misalignment between pose features of the current frame and those of neighboring frames might lead to unsatisfactory results. More importantly, existing approaches build upon the straightforward pose estimation loss, which unfortunately cannot constrain the network to fully leverage useful information from neighboring frames. To tackle these problems, we present a novel hierarchical alignment framework, which leverages coarse-to-fine deformations to progressively update a neighboring frame so that it aligns with the current frame at the feature level. We further propose to explicitly supervise the knowledge extraction from neighboring frames, guaranteeing that useful complementary cues are extracted. To achieve this goal, we theoretically analyze the mutual information between the frames and arrive at a loss that maximizes the task-relevant mutual information. These contributions allow us to rank No. 1 in the Multi-frame Person Pose Estimation Challenge on the benchmark dataset PoseTrack2017, and to obtain state-of-the-art performance on the Sub-JHMDB and PoseTrack2018 benchmarks. Our code is released at https://github.com/Pose-Group/FAMI-Pose, in the hope that it will be useful to the community.
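
For readers unfamiliar with the quantity the abstract refers to, the following is a minimal sketch of mutual information for two discrete random variables, computed from a joint probability table. It is only an illustration of the textbook definition I(X;Y) = sum over (x, y) of p(x, y) log(p(x, y) / (p(x) p(y))); it is not the loss derived in the paper, which operates on learned frame features and is not specified in this abstract. The function name `mutual_information` and the two toy tables are assumptions made for this example.

```python
import math


def mutual_information(joint):
    """Return I(X;Y) in nats for a discrete joint distribution.

    joint[i][j] is P(X=i, Y=j); entries are assumed to be
    non-negative and to sum to 1 (hypothetical toy input).
    """
    # Marginals: sum over rows for P(X), over columns for P(Y).
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]

    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0.0:  # zero-probability cells contribute nothing
                mi += p * math.log(p / (px[i] * py[j]))
    return mi


# Toy case 1: X and Y are independent, so they share no information.
independent = [[0.25, 0.25], [0.25, 0.25]]

# Toy case 2: Y always equals X, so knowing one fully determines the other.
identical = [[0.5, 0.0], [0.0, 0.5]]

mi_independent = mutual_information(independent)
mi_identical = mutual_information(identical)
```

In the toy cases, the independent table yields zero mutual information, while the fully dependent table yields log 2 nats (one bit). Loosely speaking, a loss that maximizes task-relevant mutual information pushes the features extracted from supporting frames toward the second regime with respect to the information needed for the key frame.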
