Playable Video Generation

01/28/2021
by   Willi Menapace, et al.
1

This paper introduces the unsupervised learning problem of playable video generation (PVG). In PVG, we aim at allowing a user to control the generated video by selecting a discrete action at every time step as when playing a video game. The difficulty of the task lies both in learning semantically consistent actions and in generating realistic videos conditioned on the user input. We propose a novel framework for PVG that is trained in a self-supervised manner on a large dataset of unlabelled videos. We employ an encoder-decoder architecture where the predicted action labels act as bottleneck. The network is constrained to learn a rich action space using, as main driving loss, a reconstruction loss on the generated video. We demonstrate the effectiveness of the proposed approach on several datasets with wide environment variety. Further details, code and examples are available on our project page willi-menapace.github.io/playable-video-generation-website.

READ FULL TEXT

page 1

page 4

page 8

research
10/21/2021

LARNet: Latent Action Representation for Human Action Synthesis

We present LARNet, a novel end-to-end approach for generating human acti...
research
03/03/2022

Playable Environments: Video Manipulation in Space and Time

We present Playable Environments - a new representation for interactive ...
research
11/23/2020

Action Concept Grounding Network for Semantically-Consistent Video Generation

Recent works in self-supervised video prediction have mainly focused on ...
research
04/14/2020

Unsupervised Multimodal Video-to-Video Translation via Self-Supervised Learning

Existing unsupervised video-to-video translation methods fail to produce...
research
10/13/2021

NoisyActions2M: A Multimedia Dataset for Video Understanding from Noisy Labels

Deep learning has shown remarkable progress in a wide range of problems....
research
12/04/2018

Conditional Video Generation Using Action-Appearance Captions

The field of automatic video generation has received a boost thanks to t...
research
03/28/2020

Predicting the Popularity of Micro-videos with Multimodal Variational Encoder-Decoder Framework

As an emerging type of user-generated content, micro-video drastically e...

Please sign up or login with your details

Forgot password? Click here to reset