Learning to Forecast and Refine Residual Motion for Image-to-Video Generation

07/26/2018
by Long Zhao, et al.

We consider the problem of image-to-video translation, where an input image is translated into an output video containing motions of a single object. Recent methods for such problems typically train transformation networks to generate future frames conditioned on the structure sequence. Parallel work has shown that short high-quality motions can be generated by spatiotemporal generative networks that leverage temporal knowledge from the training data. We combine the benefits of both approaches and propose a two-stage generation framework where videos are generated from structures and then refined by temporal signals. To model motions more efficiently, we train networks to learn residual motion between the current and future frames, which avoids learning motion-irrelevant details. We conduct extensive experiments on two image-to-video translation tasks: facial expression retargeting and human pose forecasting. Superior results over the state-of-the-art methods on both tasks demonstrate the effectiveness of our approach.
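The residual idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `apply_residual` and `generate_video`, the additive composition, the choice to measure every residual against the input image, and the [0, 1] pixel range are all assumptions made for illustration. In the paper the residuals would come from trained networks; here they are hand-written toy arrays.

```python
import numpy as np

def apply_residual(current_frame, residual):
    """Compose a frame as the current frame plus a residual (assumed additive)."""
    # Clip so the result stays in the assumed [0, 1] pixel range.
    return np.clip(current_frame + residual, 0.0, 1.0)

def generate_video(first_frame, residuals):
    """Build a video by adding each residual to the same input image.

    Assumption: every residual is relative to the input image,
    not to the previously generated frame.
    """
    return np.stack([apply_residual(first_frame, r) for r in residuals])

# Toy data: a flat 4x4 grayscale image and three hypothetical residual maps.
first_frame = np.full((4, 4), 0.5)
residuals = np.zeros((3, 4, 4))
residuals[:, 1, 1] = [0.1, 0.2, 0.3]  # only one pixel changes over time

video = generate_video(first_frame, residuals)
```

In this toy case the residual maps are zero everywhere except at the one changing pixel, so static pixels are copied from the input unchanged. That is one way to read the abstract's claim that learning residual motion avoids learning motion-irrelevant details.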


research
08/09/2018

Controllable Image-to-Video Translation: A Case Study on Facial Expression Generation

The recent advances in deep learning have made it possible to generate p...
research
11/23/2017

Deep Video Generation, Prediction and Completion of Human Action Sequences

Current deep learning results on video generation are limited while ther...
research
04/17/2023

Text2Performer: Text-Driven Human Video Generation

Text-driven content creation has evolved to be a transformative techniqu...
research
08/13/2022

A new way of video compression via forward-referencing using deep learning

To exploit high temporal correlations in video frames of the same scene,...
research
06/07/2019

Ego-Pose Estimation and Forecasting as Real-Time PD Control

We propose the use of a proportional-derivative (PD) control based polic...
research
07/24/2018

Visual Dynamics: Stochastic Future Generation via Layered Cross Convolutional Networks

We study the problem of synthesizing a number of likely future frames fr...
research
06/27/2021

Robust Pose Transfer with Dynamic Details using Neural Video Rendering

Pose transfer of human videos aims to generate a high fidelity video of ...
