Capturing Humans in Motion: Temporal-Attentive 3D Human Pose and Shape Estimation from Monocular Video

03/16/2022
by Wen-Li Wei, et al.

Learning to capture human motion is essential to 3D human pose and shape estimation from monocular video. However, existing methods mainly rely on recurrent or convolutional operations to model such temporal information, which limits their ability to capture non-local context relations of human motion. To address this problem, we propose a motion pose and shape network (MPS-Net) that effectively captures humans in motion to estimate accurate and temporally coherent 3D human pose and shape from a video. Specifically, we first propose a motion continuity attention (MoCA) module that leverages visual cues observed from human motion to adaptively recalibrate the range that needs attention in the sequence, so as to better capture motion continuity dependencies. Then, we develop a hierarchical attentive feature integration (HAFI) module that combines adjacent past and future feature representations to strengthen temporal correlation and refine the feature representation of the current frame. By coupling the MoCA and HAFI modules, the proposed MPS-Net excels at estimating 3D human pose and shape from video. Though conceptually simple, MPS-Net not only outperforms state-of-the-art methods on the 3DPW, MPI-INF-3DHP, and Human3.6M benchmark datasets, but also uses fewer network parameters. Video demos can be found at https://mps-net.github.io/MPS-Net/.
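
To make the idea of "non-local" temporal context concrete, the following is a minimal sketch of generic scaled dot-product self-attention applied across the frames of a sequence. It is not the MoCA or HAFI module from the paper, whose details the abstract does not give. The function name `temporal_self_attention`, the toy feature vectors, and the absence of learned projections are all assumptions made for illustration. The point is only that every frame can attend to every other frame in one step, whereas a recurrent or convolutional layer sees distant frames only through many intermediate steps.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_self_attention(frames):
    """Generic self-attention over time (illustrative, not the paper's MoCA).

    frames: list of T per-frame feature vectors (equal-length lists of floats).
    Returns (refined, weights), where weights[i][j] is how strongly frame i
    attends to frame j, and refined[i] is the attention-weighted mix of frames.
    """
    dim = len(frames[0])
    scale = math.sqrt(dim)
    weights = []
    for query in frames:
        # Similarity of this frame to every frame in the sequence, near or far.
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in frames]
        weights.append(softmax(scores))
    refined = [
        [sum(w[t] * frames[t][d] for t in range(len(frames))) for d in range(dim)]
        for w in weights
    ]
    return refined, weights


if __name__ == "__main__":
    # Hypothetical 2-D features for a 4-frame clip.
    clip = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [1.0, 0.1]]
    refined, weights = temporal_self_attention(clip)
    for i, row in enumerate(weights):
        print(i, [round(w, 3) for w in row])
```

In this toy clip, frame 0 and frame 3 have similar features although they are not adjacent, so frame 0 gives frame 3 more weight than it gives the dissimilar frame 2. That is the kind of long-range relation the abstract says purely local temporal operators struggle to capture.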

research
04/29/2023

TAPE: Temporal Attention-based Probabilistic human pose and shape Estimation

Reconstructing 3D human pose and shape from monocular videos is a well-s...
research
11/17/2020

Beyond Static Features for Temporally Consistent 3D Human Pose and Shape from a Video

Despite the recent success of single image-based 3D human pose and shape...
research
12/11/2019

VIBE: Video Inference for Human Body Pose and Shape Estimation

Human motion is fundamental to understanding behavior. Despite progress ...
research
10/01/2014

Coupling Top-down and Bottom-up Methods for 3D Human Pose and Shape Estimation from Monocular Image Sequences

Until recently Intelligence, Surveillance, and Reconnaissance (ISR) focu...
research
07/25/2022

Live Stream Temporally Embedded 3D Human Body Pose and Shape Estimation

3D Human body pose and shape estimation within a temporal sequence can b...
research
12/20/2020

High-Fidelity Neural Human Motion Transfer from Monocular Video

Video-based human motion transfer creates video animations of humans fol...
research
07/15/2023

Bidirectionally Deformable Motion Modulation For Video-based Human Pose Transfer

Video-based human pose transfer is a video-to-video generation task that...
