Temporal Consistency Two-Stream CNN for Human Motion Prediction

04/11/2021
by   Jin Tang, et al.
0

Fusion is critical for a two-stream network. In this paper, we propose a novel temporal fusion (TF) module to fuse the two-stream joints' information to predict human motion, including a temporal concatenation and a reinforcement trajectory spatial-temporal (TST) block, specifically designed to keep prediction temporal consistency. In particular, the temporal concatenation keeps the temporal consistency of preliminary predictions from two streams. Meanwhile, the TST block improves the spatial-temporal feature coupling. However, the TF module can increase the temporal continuities between the first predicted pose and the given poses and between each predicted pose. The fusion is based on a two-stream network that consists of a dynamic velocity stream (V-Stream) and a static position stream (P-Stream) because we found that the joints' velocity information improves the short-term prediction, while the joints' position information is better at long-term prediction, and they are complementary in motion prediction. Finally, our approach achieves impressive results on three benchmark datasets, including H3.6M, CMU-Mocap, and 3DPW in both short-term and long-term predictions, confirming its effectiveness and efficiency.

READ FULL TEXT
research
05/08/2023

Towards Accurate Human Motion Prediction via Iterative Refinement

Human motion prediction aims to forecast an upcoming pose sequence given...
research
09/10/2016

Sequential Deep Trajectory Descriptor for Action Recognition with Three-stream CNN

Learning the spatial-temporal representation of motion information is cr...
research
02/19/2020

Three-Stream Fusion Network for First-Person Interaction Recognition

First-person interaction recognition is a challenging task because of un...
research
08/09/2021

Development of Human Motion Prediction Strategy using Inception Residual Block

Human Motion Prediction is a crucial task in computer vision and robotic...
research
12/03/2019

Automatic Video Object Segmentation via Motion-Appearance-Stream Fusion and Instance-aware Segmentation

This paper presents a method for automatic video object segmentation bas...
research
12/03/2018

Spatial-temporal Fusion Convolutional Neural Network for Simulated Driving Behavior Recognition

Abnormal driving behaviour is one of the leading cause of terrible traff...
research
08/30/2020

Finding Action Tubes with a Sparse-to-Dense Framework

The task of spatial-temporal action detection has attracted increasing a...

Please sign up or login with your details

Forgot password? Click here to reset