Self-Supervised Equivariant Scene Synthesis from Video

by Cinjon Resnick, et al.
New York University

We propose a self-supervised framework to learn scene representations from video that are automatically delineated into background, characters, and their animations. Our method capitalizes on moving characters being equivariant with respect to their transformation across frames and on the background being invariant to that same transformation. After training, we can manipulate image encodings in real time to create unseen combinations of the delineated components. To the best of our knowledge, ours is the first method to perform unsupervised extraction and synthesis of interpretable background, character, and animation. We demonstrate results on three datasets: Moving MNIST with backgrounds, 2D video game sprites, and Fashion Modeling.




1 Introduction

Learning manipulable representations of dynamic scenes is a challenging task. Ideally, we would give our model a video and receive an inverse rendering of coherent moving objects (‘characters’) along with a set of static backgrounds through which those characters move.

A motivating example is creating a stop-motion video in which each character is painstakingly moved across static backgrounds to form the frames. Our goal is to have a model train on any number of videos, each of which has an independent set of characters and background, and then be able to synthesize new scenes mixing and matching (possibly unseen) characters and backgrounds, thereby expediting the process of creating stop-motion videos. This long-standing problem, which has no complete solution (see Section 5), can be thought of as unsupervised separation and synthesis of background and foreground, and a strong solution would address other applications as well, such as video compositing. Current unsupervised methods performing synthesis are limited in that they either cannot handle backgrounds [NIPS2015_5845, visualdynamics16, Kosiorek2018sqair], cannot synthesize the foreground and background independently [DBLP:journals/corr/VillegasYHLL17, Tulyakov_2018_CVPR, siarohin2020order], or cannot handle animation [DBLP:journals/corr/VondrickPT16].

Figure 1: Approach: We use neural networks to parameterize the character encoder f_c, the background encoder f_b, the transformation function T, and the renderer g. We encode the image to be transformed, x_k, as well as the two images, x_i and x_j, whose transformation we impose on x_k. The output of T is an affine matrix that we apply to the character encoding f_c(x_k). That result is added to the background encoding f_b(x_k) and then decoded by g to yield the output x̂. For training, we require only two distinct images (x_i, x_j) from the same video, where x_k = x_i and x̂ should approximate x_j. At test time, we can render new scenes where only x_i and x_j need to be from the same sequence.

To be more specific, assume an input video of a dynamic scene captured by a static camera. The background consists of the objects that are constant across the frames. In contrast, the foreground consists of the objects that are equivariant with respect to some family of transformations. We use this difference to infer a scene representation that captures separately the background and the characters. Building on Dupont et al. [dupont2020equivariant], we attain two novel results:

  1. Learn the transformation: We do not require the underlying transformation applied to foreground objects during training, but instead learn it from nearby video frames. At inference time, this allows us to animate characters according to any transformation, not just ones derived from frames containing the character being transformed.

  2. Distinguish characters and background: By independently encoding the background and the manipulable character, we yield disentangled representations for both and can mix and match them freely.

Our model is trained in a self-supervised fashion requiring only pairs of nearby frames (without any annotation) from each video sequence. We impose no other constraint and can render new scenes combining characters and backgrounds on the fly with additional (potentially unrelated) frames from other sequences. In Sections 4.3 and 4.4 respectively, we show strong results on both Moving MNIST [DBLP:journals/corr/SrivastavaMS15] with static backgrounds and 2D video game sprites from the Liberated Pixel Cup [bart_2012], where we demonstrate the following manipulations:

  • Render the character from one sequence but with the background from another.

  • Render the character from one sequence using the transformation seen in the change between two frames of another sequence.

  • Combine both manipulations to render the character from one sequence, on the background from a second, using the transformation exhibited by the change seen in a third.

In Section 4.5, we report results on Fashion Modeling, a more realistic dataset for which our results are not as mature.

2 Background

Following Dupont et al. [dupont2020equivariant], we define an image x and a scene representation z, a rendering function g mapping scene representations to images, and an inverse renderer f mapping images to scene representations. It is helpful to consider g(z) as a 2D rendering of a character from a specific camera viewpoint. An equivariant scene representation satisfies the following relation for a renderer g that is equivariant with respect to a transformation θ, where t_θ acts in image space and T_θ is the corresponding operation in feature space:

  t_θ(g(z)) = g(T_θ(z)).  (1)

Figure 2: Labeling the rows from top to bottom, the first three are ground truth; the fourth row pairs the background from the second row with the character from the first; the fifth row pairs the character of the first row with the animations from the third; and the sixth row combines both manipulations. With counterclockwise direction and bottom-left origin, the transformations are rotate(15), rotate(9), translate(10, 4), translate(-8, -6).

In other words, transforming a rendering in image space is equivalent to first transforming the scene in feature space and then rendering the result. Dupont et al. posit functions g and f satisfying the following, where x is a 2D image rendering, z = f(x) is the scene representation of x, and θ is an affine transformation:

  t_θ(x) = g(T_θ(f(x))).  (2)
Equation 2 is a difficult equality to satisfy, so they propose learning neural networks to approximate it from data triples (x_1, x_2, θ), where x_1 and x_2 are renderings of the same character and θ is a rotation transforming the character from its appearance in x_1 to its appearance in x_2. They assume T_θ is the same operation as t_θ but in feature space, and consequently apply a rotation of θ to the 3D representation z_1 = f(x_1). With z_1 = f(x_1) and z_2 = f(x_2), they then train f and g to minimize the reconstruction loss:

  L = ||x_2 − g(T_θ(f(x_1)))||².  (3)
After training, new renderings can be inferred by first inverse rendering z = f(x), then applying rotations T_θ to z, and finally rendering the image g(T_θ(z)). In other words, the model operates entirely in feature space. For their purposes, this lets them manipulate the output rendering quickly and without access to the object in image space. As we show in Section 3, this can be further extended to manipulations that are very difficult to perform in image space but become easy in feature space.
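This training scheme can be sketched in a few lines of numpy. The identity encoder/decoder and point-cloud "scene" below are toy stand-ins of our own, not the paper's architecture; the point is the shape of the reconstruction loss ||x_2 − g(T_θ(f(x_1)))||²:

```python
import numpy as np

def rotation(theta):
    """2D rotation matrix for angle theta (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

# Toy stand-ins: the "scene representation" is a set of 2D feature points,
# and f / g are identity maps between image and feature space.
def f(x):
    return x.copy()   # inverse renderer

def g(z):
    return z.copy()   # renderer

def reconstruction_loss(x1, x2, theta):
    """|| x2 - g(T_theta(f(x1))) ||^2 with T_theta a rotation in feature space."""
    z2_hat = f(x1) @ rotation(theta).T   # transform in feature space
    return np.mean((x2 - g(z2_hat)) ** 2)

x1 = np.array([[1.0, 0.0], [0.0, 2.0]])
theta = np.pi / 2
x2 = x1 @ rotation(theta).T              # ground-truth rotated view
```

With exact equivariance (as here), the loss with the true θ is zero; with a wrong θ it is strictly positive, which is what drives the learned networks toward the equivariant solution.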

3 Method

Our motivation is to use the smoothly changing nature of video to learn the transformations between frames. Like Dupont et al., we assume that the change between frames x_i and x_j can be modeled with affine transformations. While they use only rotation, we assume an arbitrary affine transformation on the character plus an invariant transformation on the background, in-painting as needed. Accordingly, we remove the requirement of a known θ at both training and test time by learning the transformation from data with a network T. A question arises: if we learn the transformation from data, why assume that it is an affine operation? The first reason is that it biases T towards a commonly used model for motion estimation. Second, it limits us to six interpretable axes of variation in our animation, which lets us compare directly against a ground-truth affine transformation when one is given. Finally, this approach also limits the operator's ability to 'hide' information about the target image in the representation, exemplified by works like CycleGAN [CycleGAN2017] as demonstrated by Chu et al. [DBLP:journals/corr/abs-1712-02950].


Building on Section 2, we define f_b and f_c, respectively representing the encoders of the background and the character. With scalar coefficients λ_t, λ_i, and λ_j, and denoting the predicted affine matrix by θ̂ = T(f_c(x_i), f_c(x_j)) and the transformed rendering by

  x̂ = g(f_b(x_i) + θ̂ · f_c(x_i)),

we learn neural networks f_c, f_b, T, and g by minimizing L below. During training, this requires only a dataset consisting of random pairs of frames (x_i, x_j) from each clip, where x_i and x_j have roughly the same background.

  L = λ_t ||x_j − x̂||² + λ_i ||x_i − g(f_b(x_i) + f_c(x_i))||² + λ_j ||x_j − g(f_b(x_j) + f_c(x_j))||².
The two autoencoding losses help with training, but they are not by themselves indicative of successful training. We attain quality results only when the transformation term is sufficiently minimized. Other possible constraints, such as a transformation-inverse loss on T, were not necessary.
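The structure of this objective can be sketched as follows. The plain linear maps standing in for the four networks, the least-squares solve standing in for T, and the exact weighting of the auxiliary terms are all illustrative assumptions of ours, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for the four learned networks: plain linear maps.
W_c = np.eye(8) + 0.1 * rng.standard_normal((8, 8))   # character encoder f_c
W_b = np.eye(8) + 0.1 * rng.standard_normal((8, 8))   # background encoder f_b
f_c = lambda x: x @ W_c
f_b = lambda x: x @ W_b
g = lambda z: z                                       # renderer (identity here)

def T(z_i, z_j):
    # Stand-in for the transformation network: a least-squares fit of the
    # map taking f_c(x_i) to f_c(x_j).
    theta, *_ = np.linalg.lstsq(z_i, z_j, rcond=None)
    return theta

def loss(x_i, x_j, lam=(1.0, 0.5, 0.5)):
    lam_t, lam_i, lam_j = lam
    theta_hat = T(f_c(x_i), f_c(x_j))
    x_hat = g(f_b(x_i) + f_c(x_i) @ theta_hat)            # transformed rendering
    L_t = np.mean((x_j - x_hat) ** 2)                     # transformation term
    L_i = np.mean((x_i - g(f_b(x_i) + f_c(x_i))) ** 2)    # autoencode x_i
    L_j = np.mean((x_j - g(f_b(x_j) + f_c(x_j))) ** 2)    # autoencode x_j
    return lam_t * L_t + lam_i * L_i + lam_j * L_j

x_i = rng.standard_normal((4, 8))               # "frames" from one clip
x_j = x_i + 0.1 * rng.standard_normal((4, 8))   # nearby frames
L = loss(x_i, x_j)
```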

Both T and g learn to handle every character similarly. This is important because it means that at inference time we can render novel scenes given a pair of nearby frames from a video. The renderings described in Section 1 and demonstrated in Section 4 are now:

  • g(f_b(y_k) + f_c(x_i)): Render the character as seen in x_i but with the background from y_k.

  • g(f_b(y_k) + θ̂ · f_c(y_k)), with θ̂ = T(f_c(x_i), f_c(x_j)): Render the character and background in y_k using the transformation exhibited by the change in the character from x_i to x_j.

  • g(f_b(y_m) + θ̂ · f_c(y_k)): Combine the above two to render the character in y_k on the background from y_m using the transformation exhibited by the change in the character from x_i to x_j.

Learning the transformation.

The smoothly changing nature of video lets us advance past affine transformations parametrically defined with an angle of rotation and instead learn transformations from frame changes. One advantage of this approach is that the model becomes agnostic to which transformations the data exhibits. We show this to be explicitly true for Moving MNIST and implicitly true for Sprites, where the transformations are not affine. Another advantage arises at inference time because the model is agnostic to whether the frame to be transformed is itself an input to the transformation function.

Distinguishing characters and background

The renderer is equivariant to character transformations. A static background, however, will be constant across a video. We take advantage of this to learn T along with encoders f_c and f_b corresponding to the character and the background such that, at inference time, we can mix and match characters and backgrounds never seen together before.

4 Experiments

The first dataset is a variant of Moving MNIST [DBLP:journals/corr/SrivastavaMS15] in which we overlay each sequence of moving digits on a randomly drawn background. Experiments on this dataset let us test our model on a wide range of explicit translations and rotations of the digits.

The second dataset was introduced by Reed et al. [NIPS2015_5845] and consists of moving 2D video game character sprites without any backgrounds, using graphic assets from the Liberated Pixel Cup [bart_2012]. We perform experiments both with and without backgrounds. These experiments test our model on more complex mappings, like firing a bow and arrow, rather than just affine transformations in image space.

This test-bed is suitable for demonstrating that our method exhibits three desirable capabilities: 1) Separate foreground from background; 2) Learn an interpretable but complex transformation of the foreground in the feature space; 3) Render transformed foreground objects. Furthermore, it allows us to demonstrate generalization to unseen characters (digits and sprites) and unseen backgrounds.

We also show results on a third dataset, Fashion Modeling, from Zablotskaia et al. [DBLP:journals/corr/abs-1910-09139], consisting of videos of models exhibiting clothing without any backgrounds. This dataset tests our model on realistic poses and fine detail.

4.1 Architecture

We parameterize f_c, f_b, T, and g with neural networks. While separate, f_c and f_b share the same residual-network architecture. The renderer g is a transposition of that architecture, although we drop the residual components. Please see Figures 22-24 in the Appendix for details.

The transformation network T exhibits three properties. First, it is input-order dependent. Second, it uses PyTorch's [paszke2017automatic] affine_grid and grid_sample functions to transform the scene, similarly to how Spatial Transformers [DBLP:journals/corr/JaderbergSZK15] operate. Third, it is initialized as the identity, reflecting our prior that the transformation should initially leave the foreground unchanged.
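A minimal sketch of this mechanism, using the real PyTorch affine_grid and grid_sample APIs (the function and variable names are ours):

```python
import torch
import torch.nn.functional as F

def apply_affine(z, theta):
    """Warp a feature map z (N, C, H, W) by affine matrices theta (N, 2, 3),
    the same mechanism used by Spatial Transformers."""
    grid = F.affine_grid(theta, z.shape, align_corners=False)
    return F.grid_sample(z, grid, align_corners=False)

# Identity initialization: the transformation starts as a no-op on the scene.
identity = torch.tensor([[[1.0, 0.0, 0.0],
                          [0.0, 1.0, 0.0]]])

z = torch.rand(1, 4, 16, 16)
z_out = apply_affine(z, identity)
print(torch.allclose(z, z_out, atol=1e-4))  # True: identity leaves z unchanged
```

Initializing the predicted matrix at the identity is what encodes the prior that, before training, T should not alter the foreground.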

4.2 Background Generation

We create randomly generated backgrounds for each of train, val, and test. For each background, we select a color from the Matplotlib CSS4 colors list. We then place five diamonds on the background, each with a different random color, along with an independent and randomly chosen center and radius. The radius is uniformly chosen from between seven and ten, inclusive.
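A sketch of this generation procedure, assuming a 64x64 canvas and a small hard-coded palette in place of the full Matplotlib CSS4 color list (both assumptions of ours):

```python
import numpy as np

PALETTE = np.array([  # small stand-in for the Matplotlib CSS4 color list
    [70, 130, 180], [255, 160, 122], [154, 205, 50],
    [219, 112, 147], [100, 149, 237], [244, 164, 96],
], dtype=np.uint8)

def make_background(size=64, n_diamonds=5, rng=None):
    rng = rng or np.random.default_rng()
    bg_color = PALETTE[rng.integers(len(PALETTE))]
    img = np.tile(bg_color, (size, size, 1))
    yy, xx = np.mgrid[:size, :size]
    for _ in range(n_diamonds):
        cy, cx = rng.integers(0, size, 2)        # independent random center
        r = rng.integers(7, 11)                  # radius in [7, 10] inclusive
        mask = np.abs(yy - cy) + np.abs(xx - cx) <= r   # L1 ball = diamond
        img[mask] = PALETTE[rng.integers(len(PALETTE))]
    return img

bg = make_background(rng=np.random.default_rng(0))
print(bg.shape)  # (64, 64, 3)
```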

For the Moving MNIST experiments, we use a fixed number of backgrounds for each split because the model overfit when trained with too few. For the Sprite experiments, we used a single set size for test but a range of set sizes for training.

4.3 Moving MNIST

We generate fixed-length videos of MNIST digits (characters) moving on a static background. At each training step, we select digits from the train split of MNIST, as well as a background from the set of pre-generated training backgrounds. We then randomly place these digits. For each digit, and for each frame, we choose randomly between rotation and translation. If translation, then we translate the character independently in each of the x and y directions by a random amount within a fixed range. If rotation, then we rotate the character by a random angle within a fixed range. If the character would leave the image boundaries, we redo the transformation selection. Otherwise, that transformation is applied cumulatively to yield the next character position.

This results in digit sequences on blank canvases. We overlay them on the chosen background to produce a sequence where the change in each character between frames is small and affine for the character and constant for the background. The overlay is performed by locating the (black) MNIST digit pixels and blackening those locations on the canvases. Accordingly, we do not use black as a background color. We then randomly choose two frame indices and use the corresponding pair of frames as the training pair. See Figure 2 for example sequences.
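The trajectory-sampling loop described above can be sketched as follows; the canvas and digit sizes, per-step transformation ranges, and retry cap are illustrative placeholders of ours, since the paper's exact values are not reproduced in this text, and for simplicity the bounds check covers only position:

```python
import numpy as np

def sample_trajectory(n_steps, canvas=64, digit=28, max_t=3.0, max_r=10.0,
                      rng=None, max_tries=100):
    """Cumulative per-frame transforms for one digit: each step is either a
    translation (up to max_t px per axis) or a rotation (up to max_r degrees);
    steps that would push the digit off the canvas are re-drawn."""
    rng = rng or np.random.default_rng()
    pos = rng.integers(0, canvas - digit, 2).astype(float)
    angle, frames = 0.0, []
    for _ in range(n_steps):
        for _ in range(max_tries):
            if rng.random() < 0.5:                       # translate
                new_pos = pos + rng.uniform(-max_t, max_t, 2)
                new_angle = angle
            else:                                        # rotate
                new_pos = pos
                new_angle = angle + rng.uniform(-max_r, max_r)
            if np.all(new_pos >= 0) and np.all(new_pos <= canvas - digit):
                break                                    # digit stays in bounds
        pos, angle = new_pos, new_angle                  # apply cumulatively
        frames.append((pos.copy(), angle))
    return frames

traj = sample_trajectory(10, rng=np.random.default_rng(0))
print(len(traj))  # 10
```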

Qualitative Results

Our model learns to render new scenes using characters from the test set of MNIST and held out backgrounds. All shown sequences are on unseen backgrounds with unseen MNIST digits where there are at least two transformations of each of rotation and translation.

Figure 2 convincingly demonstrates the manipulations from Section 3. The final row is rendered by encoding the character from the first row, the background from the second row, and using the transformations exhibited in the third row.

Figure 3: Per-pixel MSE over 10,000 test examples. The transform and background manipulations use our learned functions; Video frames is MSE of a frame against a random (non-identical) frame from the same video; No object is MSE of the background versus the full frame of background and character.

Quantitative Results

We tested reconstruction by evaluating the per-pixel MSE over the Moving MNIST test set. For each example, we randomly chose two (background, digit) pairs and made a corresponding video for each. We then indexed into the same random positions in both sequences to get a frame pair from each video. Figure 3 shows a boxplot comparing two manipulations, Transformation and Background, along with two baselines, Video frames and No object.

For Transformation, we compute an MSE by comparing the manipulated output against the ground truth attained by rendering the background and character from the first video, transformed as seen in the second. For Background, we compute an MSE by comparing the manipulated output against the rendering of the character from the first video on the background from the second, with the original transforms. Video frames is the MSE of two random frames from the same video. No object is the MSE of a full frame against only the background from that frame.

Given that MSE is a measure of reconstruction quality with lower values being better, we expect the baselines to serve as upper bounds. Video frames is the upper bound when reconstruction recovers the character but places it incorrectly. No object is the upper bound when reconstruction fails to include the character at all. On this measure, we see that the background manipulation is much better than the baselines, but we cannot say with certainty that the transform manipulation is better, as it is within the confidence interval of Video frames and its boxplot overlaps with both baselines.

(a) Spellcast Animation
(b) Thrust Animation
(c) Shoot Animation
Figure 4: Transferring animations. In each panel, the first and second rows are ground truth; the third row is given by applying the transformation exhibited by the first row to the character in the second. Observe in panel (b) that a spear is not hallucinated; instead, the thrust action is applied to the character without a spear. And in panel (c), even though there is no bow in the input to the transformation, the model understands to move the bow and pull the string of the character to which the transformation is applied.
Figure 5: The rows are, in order, background manipulation, animation transfer, and both combined.

4.4 Video Game Sprites

The dataset comprises color images of sprites with seven attributes (sex, body type, arm, hair, armor, greaves, and weapon) for a total of 672 unique characters. We use the same dataset splits as Reed et al. with 500 training characters, 72 validation characters, and 100 test characters. For each character, there are four viewpoints for each of five animations: spellcast, thrust, walk, slash, and shoot. Our sequences are the twenty animations per character, which have between six and thirteen frames. We only show results on test characters and a held out set of test backgrounds.

For results without backgrounds, we select a random sequence and then two random frames from that sequence as input to our model. For results with backgrounds, we additionally choose a random background and then use the provided masks to center the sprite data on top of the background. See Figures 4, 5 for examples of sequences without and with backgrounds.

Qualitative Results

Figure 4 shows strong results on three different types of animations (spellcast, thrust, shoot) without backgrounds. In each panel, the first and second rows are ground truth, as is the first frame of the third row. The rest of the third row is animated according to the transformations exhibited by the first row. In all three panels, the major concern is the hand color. The model occasionally blends the green hands of the original character with those of the character being animated. In the first panel, this appears as gray, while in the second and third panels it has a green tint. On the other hand, the model successfully performs the thrusting action without hallucinating a spear in the second panel. Similarly, the model operates the bow and arrow in the third panel, even though the first row contains only the shooting motion and not the actual bow.

Figure 5 shows results when using backgrounds. We see an example of each of the type of manipulations from Section 3. The first row is changing the background, the second row is animation transfer, and the third row is performing both background change and animation transfer. The model is strong but clearly not perfect at this task. In addition to the hand discoloring (2nd row), we also see that the backgrounds are a bit blurry near the character in all three rows.

Quantitative Results

For results without backgrounds, we tested reconstruction by evaluating the per-pixel MSE over the test set. Figure 6 compares our approach against prior work on analogy reconstruction. We are competitive in every category and the most consistent in our results, with Figure 7 showing that our model is capable across all categories. This is in contrast to the model with the best average performance from Xue et al. [DBLP:journals/corr/XueWBF16], 'VD', which performs poorly on the slash category. We further show our model's flexibility by taking a version trained with diamond backgrounds and asking it to perform the same analogy task but with a fixed gray canvas as the input to the background encoder. We use gray because the model never saw black in its training process. Figure 6 shows that this model, 'Ours-BG', attains comparable results. Examples from 'Ours-BG' and a representative MSE boxplot are in the Appendix (Figures 16 and 17).

Figure 6:

Reconstructing analogous sprites without backgrounds in pixel MSE, with max cutoff at 60.0. Our method is competitive in every category and has much lower variance than the other models, even though it was designed for datasets with affine transformations and backgrounds, both of which are missing in this context. We highlight our model’s flexibility by using a model trained with diamond backgrounds (‘Ours-BG’), asking it to remove the background, and still yielding comparable results.

Figure 7: A boxplot of sprite analogy reconstruction MSE without backgrounds. This expands upon the reported averages in Figure 6.

For sprite results with backgrounds, we similarly evaluate per-pixel MSE in Figure 9. We include only the aggregate boxplots and put the full breakdown in the Appendix (Figure 21). The first boxplot, 'Analogy', represents analogy reconstruction. It shows a clear (expected) increase in MSE over reconstruction without backgrounds. Is this due to the background or the character?

To answer that question, we utilize the provided character mask. The boxplot 'Background-only' is the MSE that results from masking out where the character should be in both the prediction and the ground truth and then computing MSE over just the unmasked part, presumed to be the background. The boxplot 'Character-only' is similar, but masks out everything except where the character should be and computes MSE over the presumed character. The 'Background-only' MSE remains consistent and is also much lower than 'Analogy'. In contrast, the 'Character-only' aggregate MSE is substantially higher, with its maximum on slash and its minimum on spellcast. This analysis suggests that the loss in MSE is dominated by the character reconstruction. This is encouraging for improving our results given the litany of recent techniques [siarohin2020order, siarohin2020motionsupervised, DBLP:journals/corr/abs-1910-09139, Tulyakov_2018_CVPR] focusing on character animation in similar settings that can be coupled with our method.
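The masked evaluation can be sketched as a single helper (the names are ours):

```python
import numpy as np

def masked_mse(pred, target, mask):
    """Per-pixel MSE restricted to mask (True = included).
    'Background-only' passes the negated character mask;
    'Character-only' passes the character mask itself."""
    diff = (pred.astype(float) - target.astype(float)) ** 2
    return diff[mask].mean()

# Tiny worked example: prediction and target differ by 1 everywhere.
pred = np.zeros((8, 8))
target = np.ones((8, 8))
char = np.zeros((8, 8), dtype=bool)
char[2:5, 2:5] = True                     # where the character should be
print(masked_mse(pred, target, char))     # 1.0 over the character region
print(masked_mse(pred, target, ~char))    # 1.0 over the background region
```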

Figure 8: Analogy reconstruction with backgrounds. The more backgrounds used during training, the better the model does at test-set reconstruction. This effect tapers near 1000.
Figure 9: Sprite analogy reconstruction with backgrounds over all animation types. Compared to Figure 7, the ‘Analogy’ boxplot shows an (expected) increase in MSE. The other two boxplots highlight that this is dominantly due to character reconstruction rather than background reconstruction.
Figure 10: In both panels, the first four columns are ground truth. The fifth is the model's output when inputting just the background encoding to the decoder g. The sixth is the output when inputting just the transformed character encoding to g. The last column includes both as input to g. Notice how the model separates the character and the background more cleanly in Moving MNIST. This is because the characters move throughout the scene, so the model has distinguishing information about all areas of the frame. Figure 15 in the Appendix shows that when the sprite moves during training in only a single cardinal direction, the model learns a similarly clean separation.

4.5 Fashion Modeling

Method                                         Perceptual (↓)  FID (↓)  AKD (↓)
MonkeyNet [DBLP:journals/corr/abs-1812-08861]      0.3726       19.74     2.47
CBTI [DBLP:journals/corr/abs-1811-11459]           0.6434       66.50     4.20
DwNet [DBLP:journals/corr/abs-1910-09139]          0.2811       13.09     1.36
Ours                                               0.4222       70.87    13.06
Table 1: Comparison with the state of the art on Fashion; lower is better on all three metrics.

Table 1 compares our results on Fashion Modeling against the current state of the art. While our model is capable of learning the transformation, it is not yet up to par with respect to realism. We consider this to be a question of model architecture and training and are actively working to improve this result. One direction is to adopt the 3D architecture from Dupont et al., which we expect to improve our results greatly given the common output errors. See Figures 25 and 26 in the Appendix for example renderings.

4.6 Analyzing θ̂

There must be structure in the predicted transformations θ̂ given their success in animating characters. We analyze this with Moving MNIST because we can compare against the ground-truth character transformation θ. For Sprites, that comparison is more difficult because the ground truth is categorical. We gather unique frame pairs of a fixed digit instance on a fixed background and show results for digits 8 and 5.

We hypothesize that there is structure in the Moving MNIST θ̂ with respect to both the translation and rotation parameters independently. Figure 12 shows a kernel density estimation plot of the Euclidean norm of the translation parameters for digit 5 (see Figure 14 in the Appendix for digit 8), attained by isolating the third column of the affine matrices. While these are the only parameters corresponding to translation in the ground-truth θ, we caution that there may be more parameters affecting translation in θ̂ due to how affine matrices compose. Observe, though, how strongly coupled and linear the relationship between these norms is, with almost all of the joint density lying in a small range.

For rotations on Moving MNIST, it is trivial to parse the angle from θ as the arctan of the first column of the corresponding affine matrix. This is not true of θ̂ because it is an unconstrained prediction with entries violating the conditions of a rotation matrix. This observation is supported by Figure 11, which shows the density plot of the top-left entry of the matrix of transformations. It is apparent that θ̂, contrary to θ, cannot correspond to only translation and rotation in how it manipulates the feature space, because these entries have too high a magnitude. Consequently, we use SVD [10.1007/BF02163027] to decompose θ̂ and yield the best approximating rotation matrix [SVD_rotation_proof_paper]. We then use the angle from that rotation matrix as the prediction and compare it to the angle attained from θ. Their exceptionally linear relationship can be observed for both digits in Figure 13.
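The SVD step is the standard nearest-rotation (orthogonal Procrustes) construction, with the angle then read off the first column as described above. A sketch, with the scale and noise levels illustrative rather than taken from the learned model:

```python
import numpy as np

def nearest_rotation(A):
    """Closest rotation to A in the Frobenius norm, via SVD:
    A = U S V^T  ->  R = U diag(1, ..., 1, det(U V^T)) V^T."""
    U, _, Vt = np.linalg.svd(A)
    d = np.sign(np.linalg.det(U @ Vt))
    D = np.diag([1.0] * (A.shape[0] - 1) + [d])   # fix a possible reflection
    return U @ D @ Vt

def rotation_angle(R):
    """Angle of a 2D rotation matrix, read from its first column."""
    return np.arctan2(R[1, 0], R[0, 0])

# A scaled, lightly perturbed rotation, mimicking the unconstrained 2x2
# block of a predicted transformation.
angle = 0.3
R_true = np.array([[np.cos(angle), -np.sin(angle)],
                   [np.sin(angle),  np.cos(angle)]])
A = 1.7 * R_true + 0.01 * np.random.default_rng(0).standard_normal((2, 2))
R = nearest_rotation(A)
print(rotation_angle(R))  # close to 0.3, despite the scaling and noise
```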

The relationship seen in these figures is further supported by a linear regression from θ̂ to θ. We flatten both to six-dimensional representations and use sklearn's LinearRegression [scikit-learn]. As expected, the R² score for each digit is exceptionally high.
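The paper uses sklearn's LinearRegression; an equivalent numpy sketch of the fit and a pooled R² score, run on synthetic data in place of the real (θ̂, θ) pairs, looks like this:

```python
import numpy as np

def r2_linear_fit(X, Y):
    """Fit Y ~ X W + b by least squares and return a pooled R^2 score,
    closely analogous to LinearRegression().fit(X, Y).score(X, Y)."""
    X1 = np.hstack([X, np.ones((len(X), 1))])      # append a bias column
    W, *_ = np.linalg.lstsq(X1, Y, rcond=None)
    residual = Y - X1 @ W
    ss_res = np.sum(residual ** 2)
    ss_tot = np.sum((Y - Y.mean(axis=0)) ** 2)
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(0)
# Synthetic stand-ins: flattened 2x3 matrices related by a linear map + noise.
theta_hat = rng.standard_normal((500, 6))
theta = theta_hat @ rng.standard_normal((6, 6)) + 0.05 * rng.standard_normal((500, 6))
print(r2_linear_fit(theta_hat, theta))  # near 1.0: a near-linear relationship
```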

Altogether, this analysis suggests that the learned θ̂ has a strong linear relationship with the ground truth θ, even though the model had no prior knowledge of θ. Furthermore, it also suggests that even though θ̂ and θ have very different statistics (exemplified by the two plots in Figure 11), θ̂ is not actually that different from θ in how it manipulates the image space, which is what we desire.

Figure 11: Transformation densities over 40,000 (MNIST) and 105,000 (Sprites) random, unique frame pairs. The left plot shows the largest singular value of the affine transformation, where θ is not shown as its upper-left square matrix is always unitary. The right plot shows the top-left entry of the matrix; the isolated region of positive density is a plotting artifact.
Figure 12: Kernel density estimation plot of the Euclidean norm of the translation parameters for Moving MNIST, with the norm from θ̂ on the y-axis and the norm from θ on the x-axis.
Figure 13: Moving MNIST scatterplot comparing the rotation angle of θ̂ against the rotation angle of θ. The latter is ground truth, while the former is attained from the rotation matrix we yield by decomposing θ̂ with SVD.

5 Related Work

Both the problem of background and foreground separation, as well as that of manipulable synthesis, are longstanding concerns in unsupervised learning 

[leroux2011learning, 786972, articlegreedy, Rubinstein13Unsupervised, DBLP:journals/corr/VillegasYHLL17, NIPS2017_2d2ca7ee]. As far as we know, we are the first to perform both without strong supervision such that we can independently manipulate each of the animation, the background, and the character.

Besides Dupont et al. [dupont2020equivariant], the most closely related papers are Worrall et al. [DBLP:journals/corr/abs-1710-07307], Olszewski et al. [DBLP:journals/corr/abs-1904-06458], and Reed et al. [NIPS2015_5845]. The first two also use equivariance to learn representations capable of manipulating scenes. However, they do not delineate characters and backgrounds, nor do they learn the transformation from data. Dupont et al. assume that θ is given during training; in Worrall et al., the transformation is a block-diagonal composition of (given) domain-specific transformations; Olszewski et al. use a user-provided transformation. Because we learn the transformation from data, we can use datasets without ground-truth transformations, like Sprites [bart_2012]. Reed et al. similarly apply transformations to frames to yield an analogous frame. Where we use an affine transform, they use addition. They also require three frames as input at inference and four frames during training. Most problematic is that they require the full analogy during training. This means they cannot learn the transformation from random frame pairs as we do, but instead require a carefully built training set.

The papers from Siarohin et al. [siarohin2020order, siarohin2020motionsupervised] also use just pairs of images at training time and can then animate an unseen character at inference. They do not handle backgrounds independently and consequently are not able to mix and match foreground and background without artifacts. MonkeyNet [DBLP:journals/corr/abs-1812-08861] and Zablotskaia et al. [DBLP:journals/corr/abs-1910-09139] are also in this lineage and use the inductive bias provided by DensePose (and a warping module for the latter). They similarly focus on just modeling the character and place the background as is.

There are approaches that decompose the latent space with GANs [DBLP:journals/corr/MathieuZSRL16, Tulyakov_2018_CVPR] or VAEs [visualdynamics16, Kosiorek2018sqair]. Most share the same limitation of being unable to synthesize the background and foreground independently, with no mechanism for delineating the two, even though there may be mechanisms for parsing the foreground, as in Kosiorek et al. [Kosiorek2018sqair]. Vondrick et al. [DBLP:journals/corr/VondrickPT16] does have such a mechanism; however, it serves only to separate background from character pixels through a masking function. Because our mechanism acts on the learned character function, we are able to model the animation as well.

6 Conclusion

We have presented a self-supervised framework for learning an equivariant renderer capable of delineating the background, the characters, and their animation such that it can manipulate and synthesize each independently. Our framework requires only video sequences. We tested it on two datasets chosen to highlight the model's capability in handling both truly affine movements (Moving MNIST) as well as movements from video (Sprites). We then showed that it can produce convincing and consistent scene manipulations, both of background and animation transfer. Our analysis revealed a strong association between the learned θ̂ and the ground-truth θ, which was never seen during training, implying the effectiveness of the proposed self-supervised learning criterion. We also observed aspects of θ̂ not fully explained by θ.

The assumption that the transformation is affine does not hold in general in pixel space; a clear counterexample is video with nonlinear lighting effects. However, there is no reason why such an assumption could not hold on the learned latent representation. Consequently, we do not see any barriers to our approach working on real-world examples such as the motivating stop-motion animation. While our results are not yet strong enough on complex real-world applications (see Section 4.5), we see this work as a promising step in that direction and leave that advance for future work.
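A small numerical sketch (ours, not from the paper) illustrates why the affine assumption belongs in a learned latent space rather than in pixel space: gamma correction, a simple nonlinear lighting change, is poorly fit by any affine map on raw intensities, yet becomes exactly affine under a hypothetical log-encoding of the pixels.

```python
import numpy as np

# Nonlinear lighting (gamma correction) on pixel intensities in (0, 1].
x = np.linspace(0.1, 1.0, 50)
y = x ** 2.2

# The best affine fit a*x + b in pixel space leaves a clear residual ...
a, b = np.polyfit(x, y, 1)
pixel_residual = float(np.max(np.abs(a * x + b - y)))

# ... but under a hypothetical log-encoder z = log(x), the same effect
# is exactly affine: log(x ** 2.2) = 2.2 * log(x) + 0.
z = np.log(x)
latent_residual = float(np.max(np.abs(2.2 * z - np.log(y))))

print(pixel_residual > 1e-2, latent_residual < 1e-9)  # True True
```

The log-encoder here is chosen by hand for illustration; the point of the framework is that an encoder with this linearizing property can be learned from data.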



Figure 14: Matching KDE plot for digit 8. See Figure 12 in the main text for digit 5.
Figure 15: Panel referenced by Figure 10 showing five results from a model trained with sprites undergoing x-translation during training in addition to the normal changes in the character animation. Observe that the model learns a clean separation of background and character in this case.
Figure 16: MSE plot for reconstructing analogies on blank canvases using a model trained with backgrounds. See Figures 3–6 for matching analysis and Figure 17 for examples. Note that we did not use black backgrounds for the reconstruction but gray ones instead, since the model was never trained with the color black and generalizing to a completely new color is out of scope.
Figure 17: Example sequences of removing the background. The first two columns are the animation, the third column is the character, and the fourth column is the character on a bare background. These inputs yield the output seen in the sixth column which can be compared to the ground truth expected output in the fifth column.
Figure 18: Reconstructions: The first, third, and fifth rows are original sequences. The second, fourth, and sixth rows are reconstructions of the prior row where the transformation is fixed to the ground-truth transformation. Note that in these scenarios, the results are not as strong as when the transformation is a learned function.
Figure 19: Backgrounds: The first, second, and fourth rows are originals. The third and fifth rows are the prior row but with the background changed to that of the first sequence.
Figure 20: Transformations: The first, second, and fourth rows are originals. The third and fifth rows are the prior row but the transformation is a function of the first row.
(a) Analogy MSE
(b) Background-only MSE
(c) Character-only MSE
Figure 21: Analogy reconstruction with backgrounds. As expected, the model’s MSE increases when incorporating backgrounds. Panels (b) and (c) show that this is dominantly due to reconstructing the character and not the background.
Figure 22: Architecture of the Encoder for both characters and backgrounds.
Figure 23: Architecture of the Decoder.
Figure 24: Architecture of the transformation network.
Figure 25: Fashion reconstructions. The first column serves as the first animation frame and the character; the second serves as the second animation frame and the target. In the first row, we can see that the model is unsure what to do with the legs. In the second and third rows, the model has successfully repositioned the character, but the output remains blurry.
Figure 26: Fashion analogies. The first and second columns represent the animation to be applied to the character in the third column. The first and fourth rows show successful animations, even with the subtle motion in the latter. The second row faces the wrong direction. The third row should have her arm still up and her leg moved to the side.