S3VAE: Self-Supervised Sequential VAE for Representation Disentanglement and Data Generation

05/23/2020
by Yizhe Zhu, et al.

We propose a sequential variational autoencoder that learns disentangled representations of sequential data (e.g., video and audio) under self-supervision. Specifically, we exploit readily accessible supervisory signals, obtained from the input data itself or from off-the-shelf functional models, and design auxiliary tasks through which our model uses these signals. With this supervision, our model can easily disentangle the representation of an input sequence into static factors and dynamic factors (i.e., time-invariant and time-varying parts). Comprehensive experiments on video and audio verify the effectiveness of our model for representation disentanglement and generation of sequential data, and demonstrate that our model with self-supervision performs comparably to, if not better than, the fully supervised model with ground-truth labels, and outperforms state-of-the-art unsupervised models by a large margin.
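To make the static/dynamic distinction concrete, here is a toy sketch in plain Python. It is not the model described in the abstract: it simply treats the per-dimension mean over time as a stand-in for the "static" part and the per-step residuals as the "dynamic" part, then checks that reordering the time steps leaves the static part unchanged while changing the dynamic part. The function names `split_static_dynamic` and `permute_time` and the toy data are assumptions made for illustration only.

```python
def split_static_dynamic(sequence):
    """Toy decomposition (an assumption, not the paper's method).

    static  = per-dimension mean over time (time-invariant part)
    dynamic = per-step residual around that mean (time-varying part)
    """
    num_steps = len(sequence)
    dim = len(sequence[0])
    static = [sum(frame[d] for frame in sequence) / num_steps for d in range(dim)]
    dynamic = [[frame[d] - static[d] for d in range(dim)] for frame in sequence]
    return static, dynamic


def permute_time(sequence, order):
    """Return the sequence with its time steps rearranged according to `order`."""
    return [sequence[i] for i in order]


# Toy sequence of 4 steps: dimension 0 changes over time, dimension 1 is constant.
sequence = [[1.0 + t, 5.0] for t in range(4)]

static, dynamic = split_static_dynamic(sequence)
static_perm, dynamic_perm = split_static_dynamic(permute_time(sequence, [2, 0, 3, 1]))

print("static (original):", static)
print("static (permuted):", static_perm)
print("dynamic changed by permutation:", dynamic != dynamic_perm)
```

In this toy setting the static part is identical before and after the permutation, whereas the dynamic residuals follow the new frame order. A learned model would replace the hand-written mean and residual with encoder outputs, but the invariance property being illustrated is the same.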
