Learn the Force We Can: Multi-Object Video Generation from Pixel-Level Interactions

06/06/2023
by Aram Davtyan, et al.

We propose a novel unsupervised method that autoregressively generates videos from a single frame and a sparse motion input. Our trained model can generate realistic object-to-object interactions and separate the dynamics and extents of multiple objects, despite observing them only under correlated motion. The key components of our method are a randomized conditioning scheme, the encoding of the input motion control, and randomized, sparse sampling that breaks these correlations. Our model, which we call YODA, can therefore move objects without physically touching them. We show both qualitatively and quantitatively that YODA accurately follows the user control while achieving video quality on par with or better than prior state-of-the-art video generation methods on several datasets. For videos, visit our project website: https://araachie.github.io/yoda.
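To make the conditioning concrete, here is a minimal sketch of how a sparse pixel-level motion control might be rasterized into a dense map and fed to an autoregressive generator. This is not the authors' released code: the function names encode_sparse_motion and rollout, the (x, y, dx, dy) click format, and the mask-channel encoding are all illustrative assumptions, not YODA's actual interface.

```python
# A minimal sketch, NOT the authors' released code: it illustrates one
# plausible way to rasterize sparse pixel-level motion controls into a
# dense conditioning map and to roll a generator out autoregressively.
# All names and the (x, y, dx, dy) click format are assumptions.
import torch

def encode_sparse_motion(clicks, h, w):
    """Rasterize sparse (x, y, dx, dy) interactions into a (3, h, w) map:
    two displacement channels plus a binary mask marking the few pixels
    where the user actually supplied motion."""
    control = torch.zeros(3, h, w)
    for x, y, dx, dy in clicks:
        control[0, y, x] = dx   # horizontal displacement
        control[1, y, x] = dy   # vertical displacement
        control[2, y, x] = 1.0  # mask: a control signal is present here
    return control

@torch.no_grad()
def rollout(model, first_frame, clicks_per_step, h, w):
    """Autoregressively generate a video from one frame and sparse controls."""
    frames = [first_frame]
    for clicks in clicks_per_step:
        control = encode_sparse_motion(clicks, h, w)
        # The generator conditions on the latest frame and the control map.
        frames.append(model(frames[-1], control))
    return torch.stack(frames)  # (num_steps + 1, 3, h, w)

if __name__ == "__main__":
    h, w = 64, 64
    identity = lambda frame, control: frame  # stand-in for a trained model
    video = rollout(identity, torch.zeros(3, h, w), [[(10, 20, 2, 0)]] * 4, h, w)
    print(video.shape)  # torch.Size([5, 3, 64, 64])
```

Note that the abstract describes the randomized conditioning scheme and the randomized, sparse sampling as training-time mechanisms for breaking motion correlations; the sketch above covers only an inference-time rollout.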

Related research

08/19/2021 · Click to Move: Controlling Video Generation with Sparse Motion
This paper introduces Click to Move (C2M), a novel framework for video g...

11/19/2021 · Xp-GAN: Unsupervised Multi-object Controllable Video Generation
Video generation is a relatively new and yet popular subject in machine ...

05/06/2023 · Multi-object Video Generation from Single Frame Layouts
In this paper, we study video synthesis with emphasis on simplifying the...

12/21/2021 · Continuous-Time Video Generation via Learning Motion Dynamics with Neural ODE
In order to perform unconditional video generation, we must learn the di...

06/19/2019 · Unsupervised Learning of Object Structure and Dynamics from Videos
Extracting and predicting object structure and dynamics from videos with...

07/06/2021 · iPOKE: Poking a Still Image for Controlled Stochastic Video Synthesis
How would a static scene react to a local poke? What are the effects on ...

09/29/2021 · The Object at Hand: Automated Editing for Mixed Reality Video Guidance from Hand-Object Interactions
In this paper, we address the problem of how to automatically extra...
