Learning 3D Human Dynamics from Video

12/04/2018
by Angjoo Kanazawa, et al.

From an image of a person in action, we can easily guess the 3D motion of the person in the immediate past and future. This is because we have a mental model of 3D human dynamics that we have acquired from observing visual sequences of humans in motion. We present a framework that can similarly learn a representation of 3D dynamics of humans from video via a simple but effective temporal encoding of image features. At test time, from video, the learned temporal representation can recover smooth 3D mesh predictions. From a single image, our model can recover the current 3D mesh as well as its 3D past and future motion. Our approach is designed so it can learn from videos with 2D pose annotations in a semi-supervised manner. However, annotated data is always limited. On the other hand, there are millions of videos uploaded daily on the Internet. In this work, we harvest this Internet-scale source of unlabeled data by training our model on these videos with pseudo-ground truth 2D pose obtained from an off-the-shelf 2D pose detector. Our experiments show that adding more videos with pseudo-ground truth 2D pose monotonically improves 3D prediction performance. We evaluate our model on the recent challenging dataset of 3D Poses in the Wild and obtain state-of-the-art performance on the 3D prediction task without any fine-tuning. The project website with video can be found at https://akanazawa.github.io/human_dynamics/.
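The abstract describes a temporal encoder over per-frame image features that regresses the current 3D body mesh together with its recent past and near future. Below is a minimal sketch of that idea, not the authors' released code: the use of 1D temporal convolutions, the layer sizes, and the 85-dimensional parameter vector (body-model pose and shape plus a weak-perspective camera) are assumptions for illustration.

```python
# Minimal sketch (assumptions, not the paper's implementation): per-frame image
# features are smoothed by a 1D temporal convolutional encoder, and the feature
# at the center frame is regressed to 3D body parameters for the current frame
# plus offsets for a past and a future frame.
import torch
import torch.nn as nn

class TemporalEncoder(nn.Module):
    def __init__(self, feat_dim=2048, hidden=1024, num_layers=3):
        super().__init__()
        layers, in_ch = [], feat_dim
        for _ in range(num_layers):
            layers += [nn.Conv1d(in_ch, hidden, kernel_size=3, padding=1),
                       nn.GroupNorm(32, hidden),
                       nn.ReLU(inplace=True)]
            in_ch = hidden
        self.net = nn.Sequential(*layers)

    def forward(self, feats):                 # feats: (B, T, feat_dim) per-frame features
        x = feats.transpose(1, 2)             # -> (B, feat_dim, T) for Conv1d
        return self.net(x).transpose(1, 2)    # temporally smoothed features: (B, T, hidden)

class DynamicsHead(nn.Module):
    """Regress current-frame body parameters and deltas for past/future frames."""
    def __init__(self, hidden=1024, param_dim=85):
        super().__init__()
        self.current = nn.Linear(hidden, param_dim)
        self.past = nn.Linear(hidden, param_dim)
        self.future = nn.Linear(hidden, param_dim)

    def forward(self, strip):                 # strip: (B, hidden) feature at the center frame
        theta_t = self.current(strip)
        return theta_t, theta_t + self.past(strip), theta_t + self.future(strip)

# Example: encode a 20-frame clip of 2048-d per-frame features and read out the
# parameters at the center frame. During training, the predicted parameters would
# be supervised with 2D keypoint reprojection losses against (pseudo-)ground truth.
encoder, head = TemporalEncoder(), DynamicsHead()
feats = torch.randn(2, 20, 2048)
strip = encoder(feats)                        # (2, 20, 1024)
theta_now, theta_past, theta_future = head(strip[:, 10])
```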


