Disentangling Representations using Gaussian Processes in Variational Autoencoders for Video Prediction

01/08/2020
by   Sarthak Bhagat, et al.
26

We introduce MGP-VAE, a variational autoencoder which uses Gaussian processes (GP) to model the latent space distribution. We employ MGP-VAE for the unsupervised learning of video sequences to obtain disentangled representations. Previous work in this area has mainly been confined to separating dynamic information from static content. We improve on previous results by establishing a framework by which multiple features, static or dynamic, can be disentangled. Specifically we use fractional Brownian motions (fBM) and Brownian bridges (BB) to enforce an inter-frame correlation structure in each independent channel. We show that varying this correlation structure enables one to capture different aspects of variation in the data. We demonstrate the quality of our disentangled representations on numerous experiments on three publicly available datasets, and also perform quantitative tests on a video prediction task. In addition, we introduce a novel geodesic loss function which takes into account the curvature of the data manifold to improve learning in the prediction task. Our experiments show quantitatively that the combination of our improved disentangled representations with the novel loss function enable MGP-VAE to outperform the state-of-the-art in video prediction.

READ FULL TEXT

page 6

page 7

page 8

research
09/05/2019

Independent Subspace Analysis for Unsupervised Learning of Disentangled Representations

Recently there has been an increased interest in unsupervised learning o...
research
06/07/2019

Disentangled State Space Representations

Sequential data often originates from diverse domains across which stati...
research
07/26/2023

Learning Disentangled Discrete Representations

Recent successes in image generation, model-based reinforcement learning...
research
01/19/2021

Disentangled Recurrent Wasserstein Autoencoder

Learning disentangled representations leads to interpretable models and ...
research
12/10/2018

Disentangled Dynamic Representations from Unordered Data

We present a deep generative model that learns disentangled static and d...
research
05/31/2017

Unsupervised Learning of Disentangled Representations from Video

We present a new model DrNET that learns disentangled image representati...
research
02/16/2018

Disentangling by Factorising

We define and address the problem of unsupervised learning of disentangl...

Please sign up or login with your details

Forgot password? Click here to reset