Unsupervised Disentanglement of Pose, Appearance and Background from Images and Videos

01/26/2020
by Aysegul Dundar, et al.

Unsupervised landmark learning is the task of learning semantic keypoint-like representations without expensive keypoint-level annotations. A popular approach is to factorize an image into a pose stream and an appearance stream, then reconstruct the image from the factorized components. The pose representation should capture a set of consistent and tightly localized landmarks in order to facilitate reconstruction of the input image. Ultimately, we wish for our learned landmarks to focus on the foreground object of interest. However, reconstructing the entire image forces the model to allocate some landmarks to the background. This work explores the effects of factorizing the reconstruction task into separate foreground and background reconstructions, conditioning only the foreground reconstruction on the unsupervised landmarks. Our experiments demonstrate that the proposed factorization results in landmarks that are focused on the foreground object of interest. Furthermore, the rendered background quality also improves, as the background rendering pipeline no longer relies on ill-suited landmarks to model its pose and appearance. We demonstrate this improvement in the context of the video-prediction task.
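
The factorization described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the foreground and background reconstructions are combined, so the Gaussian heatmap pose encoding, the soft mask derived from those heatmaps, and the function names `landmarks_to_heatmaps`, `composite`, and `reconstruction_loss` are all assumptions made for illustration. Constant arrays stand in for the outputs of the learned foreground and background renderers.

```python
import numpy as np

def landmarks_to_heatmaps(landmarks, height, width, sigma=2.0):
    """Render K (x, y) landmarks as K Gaussian heatmaps of shape (K, H, W).

    Assumption: a Gaussian heatmap is used as the pose representation.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    heatmaps = []
    for x, y in landmarks:
        sq_dist = (xs - x) ** 2 + (ys - y) ** 2
        heatmaps.append(np.exp(-sq_dist / (2.0 * sigma ** 2)))
    return np.stack(heatmaps)

def composite(foreground, background, mask):
    """Blend per pixel: mask * foreground + (1 - mask) * background."""
    return mask * foreground + (1.0 - mask) * background

def reconstruction_loss(image, reconstruction):
    """Mean squared error between the input image and its reconstruction."""
    return float(np.mean((image - reconstruction) ** 2))

# Two hypothetical landmarks on a 16x16 image, given as (x, y) pixel coordinates.
landmarks = np.array([[4.0, 4.0], [10.0, 6.0]])
heatmaps = landmarks_to_heatmaps(landmarks, height=16, width=16)

# Assumption: a soft foreground mask taken as the per-pixel maximum over heatmaps.
mask = heatmaps.max(axis=0)[..., None]  # shape (16, 16, 1)

# Stand-ins for the renderer outputs. Only the foreground would be conditioned
# on the landmarks; the background is produced independently of them.
foreground = np.ones((16, 16, 3))
background = np.zeros((16, 16, 3))

reconstruction = composite(foreground, background, mask)
loss = reconstruction_loss(foreground, reconstruction)
```

In this toy setup the reconstruction equals the foreground exactly at the landmark locations and falls back to the background far from them, which mirrors the idea that landmarks only need to explain the foreground object.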


research
07/03/2019

Learning Landmarks from Unaligned Data using Image Translation

We introduce a method for learning landmark detectors from unlabelled vi...
research
06/20/2018

Conditional Image Generation for Learning the Structure of Visual Objects

In this paper, we consider the problem of learning landmarks for object ...
research
12/07/2017

Disentangled Person Image Generation

Generating novel, yet realistic, images of persons is a challenging task...
research
06/29/2020

Unsupervised Landmark Learning from Unpaired Data

Recent attempts for unsupervised landmark learning leverage synthesized ...
research
12/05/2016

ROAM: a Rich Object Appearance Model with Application to Rotoscoping

Rotoscoping, the detailed delineation of scene elements through a video ...
research
10/03/2020

A simulation environment for drone cinematography

In this paper, we present a workflow for the simulation of drone operati...
research
04/24/2020

Neural Head Reenactment with Latent Pose Descriptors

We propose a neural head reenactment system, which is driven by a latent...
