Controllable Video Generation through Global and Local Motion Dynamics

04/13/2022
by Aram Davtyan, et al.

We present GLASS, a method for Global and Local Action-driven Sequence Synthesis. GLASS is a generative model that is trained on video sequences in an unsupervised manner and can animate an input image at test time. The method learns to segment frames into foreground and background layers and to generate transitions of the foreground over time through a global and local action representation. Global actions are explicitly related to 2D shifts, while local actions are related to local deformations, both geometric and photometric. GLASS uses a recurrent neural network to transition between frames and is trained through a reconstruction loss. We also introduce W-Sprites (Walking Sprites), a novel synthetic dataset with a predefined action space. We evaluate our method on both W-Sprites and real datasets and find that GLASS can generate realistic video sequences from a single input image and successfully learns a more advanced action space than prior work.
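
The layered view described in the abstract (a foreground that moves by a global 2D shift over a static background) can be illustrated with a small NumPy sketch. This is not the authors' implementation: GLASS learns the segmentation and the shifts with neural networks, whereas the sketch below assumes a given hard mask and an integer pixel shift. The function names `shift_layer` and `composite` are hypothetical and chosen only for illustration.

```python
import numpy as np


def shift_layer(layer, dx, dy):
    """Translate an array by integer (dx, dy) pixels, zero-filling vacated pixels.

    Assumption: integer shifts only; a learned model would typically use a
    differentiable, sub-pixel warp instead.
    """
    out = np.zeros_like(layer)
    h, w = layer.shape[:2]
    if abs(dx) >= w or abs(dy) >= h:
        return out  # the layer has moved entirely out of the frame
    # Source window in the input and destination window in the output.
    src_y = slice(max(0, -dy), min(h, h - dy))
    dst_y = slice(max(0, dy), min(h, h + dy))
    src_x = slice(max(0, -dx), min(w, w - dx))
    dst_x = slice(max(0, dx), min(w, w + dx))
    out[dst_y, dst_x] = layer[src_y, src_x]
    return out


def composite(foreground, mask, background, dx, dy):
    """Shift the foreground and its mask, then blend them over the background.

    foreground, background: (H, W, C) float arrays.
    mask: (H, W) float array in [0, 1], where 1 marks foreground pixels.
    """
    fg = shift_layer(foreground, dx, dy)
    m = shift_layer(mask, dx, dy)[..., None]  # add a channel axis for broadcasting
    return m * fg + (1.0 - m) * background


# Toy example: a single white foreground pixel on a black background.
background = np.zeros((4, 4, 3))
foreground = np.zeros((4, 4, 3))
mask = np.zeros((4, 4))
foreground[1, 1] = 1.0
mask[1, 1] = 1.0

# A "global action" of 2 pixels right and 1 pixel down.
frame = composite(foreground, mask, background, dx=2, dy=1)
```

In this toy example the foreground pixel at row 1, column 1 ends up at row 2, column 3 of `frame`. Local actions, which the abstract relates to geometric and photometric deformations, are not modelled here.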


