Deep Two-Stream Video Inference for Human Body Pose and Shape Estimation

10/22/2021
by Ziwen Li, et al.

Several video-based 3D pose and shape estimation algorithms have been proposed to resolve the temporal inconsistency of single-image-based methods. However, stable and accurate reconstruction remains challenging. In this paper, we propose a new framework, Deep Two-Stream Video Inference for Human Body Pose and Shape Estimation (DTS-VIBE), to generate 3D human pose and mesh from RGB videos. We reformulate the task as a multi-modality problem that fuses RGB and optical flow for more reliable estimation. To fully utilize both sensory modalities (RGB and optical flow), we train a transformer-based two-stream temporal network to predict SMPL parameters. The supplementary modality, optical flow, helps maintain temporal consistency by leveraging motion knowledge between two consecutive frames. The proposed algorithm is extensively evaluated on the Human3.6M and 3DPW datasets. The experimental results show that it outperforms other state-of-the-art methods by a significant margin.
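
As a rough illustration of the two-stream idea, the sketch below pairs per-frame RGB features with optical flow features and splits a predicted SMPL parameter vector into pose and shape. It is not the DTS-VIBE implementation: the function names `pair_streams` and `split_smpl`, the concatenation-based fusion, and the choice to reuse the first flow feature for frame 0 are all assumptions made for this example, and the transformer itself is omitted. The only facts relied on are that optical flow is defined between two consecutive frames (so T frames give T - 1 flow fields) and that the standard SMPL model uses 72 pose values and 10 shape coefficients.

```python
from typing import Dict, List, Sequence

Vector = List[float]

SMPL_POSE_DIM = 72   # 24 joints x 3 axis-angle values
SMPL_SHAPE_DIM = 10  # 10 shape coefficients (betas)


def pair_streams(rgb_feats: Sequence[Vector],
                 flow_feats: Sequence[Vector]) -> List[Vector]:
    """Concatenate each frame's RGB feature with a flow feature.

    Flow is computed between consecutive frames, so a clip of T frames
    yields T - 1 flow features. Assumption for this sketch: frame 0
    reuses the first flow feature so both streams have length T.
    """
    if len(rgb_feats) < 2:
        raise ValueError("need at least two frames to define optical flow")
    if len(flow_feats) != len(rgb_feats) - 1:
        raise ValueError("expected T - 1 flow features for T RGB frames")
    padded_flow = [flow_feats[0]] + list(flow_feats)
    return [list(r) + list(f) for r, f in zip(rgb_feats, padded_flow)]


def split_smpl(params: Sequence[float]) -> Dict[str, List[float]]:
    """Split a flat 82-value vector into SMPL pose and shape parts."""
    expected = SMPL_POSE_DIM + SMPL_SHAPE_DIM
    if len(params) != expected:
        raise ValueError(f"expected {expected} values, got {len(params)}")
    return {
        "pose": list(params[:SMPL_POSE_DIM]),
        "shape": list(params[SMPL_POSE_DIM:]),
    }
```

In a full model, the fused per-frame vectors would be fed to a temporal network that regresses one SMPL parameter vector per frame; concatenation is only one of several plausible fusion choices.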

research
08/20/2023

Co-Evolution of Pose and Mesh for 3D Human Body Estimation from Video

Despite significant progress in single image-based 3D human mesh recover...
research
12/10/2019

Flow-Distilled IP Two-Stream Networks for Compressed Video Action Recognition

Two-stream networks have achieved great success in video recognition. A ...
research
06/09/2015

Flowing ConvNets for Human Pose Estimation in Videos

The objective of this work is human pose estimation in videos, where mul...
research
11/17/2018

Explicit Pose Deformation Learning for Tracking Human Poses

We present a method for human pose tracking that learns explicitly about...
research
11/17/2020

Beyond Static Features for Temporally Consistent 3D Human Pose and Shape from a Video

Despite the recent success of single image-based 3D human pose and shape...
research
11/06/2021

ROFT: Real-Time Optical Flow-Aided 6D Object Pose and Velocity Tracking

6D object pose tracking has been extensively studied in the robotics and...
research
11/04/2020

Mutual Modality Learning for Video Action Classification

The construction of models for video action classification progresses ra...
