Private-Shared Disentangled Multimodal VAE for Learning of Hybrid Latent Representations

12/23/2020
by   Mihee Lee, et al.
9

Multi-modal generative models represent an important family of deep models, whose goal is to facilitate representation learning on data with multiple views or modalities. However, current deep multi-modal models focus on the inference of shared representations, while neglecting the important private aspects of data within individual modalities. In this paper, we introduce a disentangled multi-modal variational autoencoder (DMVAE) that utilizes disentangled VAE strategy to separate the private and shared latent spaces of multiple modalities. We specifically consider the instance where the latent factor may be of both continuous and discrete nature, leading to the family of general hybrid DMVAE models. We demonstrate the utility of DMVAE on a semi-supervised learning task, where one of the modalities contains partial data labels, both relevant and irrelevant to the other modality. Our experiments on several benchmarks indicate the importance of the private-shared disentanglement as well as the hybrid latent representation.

READ FULL TEXT

page 7

page 8

page 15

page 16

page 17

page 18

page 19

research
11/08/2019

Variational Mixture-of-Experts Autoencoders for Multi-Modal Deep Generative Models

Learning generative models that span multiple data modalities, such as v...
research
10/16/2012

Factorized Multi-Modal Topic Model

Multi-modal data collections, such as corpora of paired images and text ...
research
01/27/2021

Learning Abstract Representations through Lossy Compression of Multi-Modal Signals

A key competence for open-ended learning is the formation of increasingl...
research
02/14/2018

Multimodal Generative Models for Scalable Weakly-Supervised Learning

Multiple modalities often co-occur when describing natural phenomena. Le...
research
05/29/2018

Disentangling by Partitioning: A Representation Learning Framework for Multimodal Sensory Data

Multimodal sensory data resembles the form of information perceived by h...
research
11/16/2017

Deep Matching Autoencoders

Increasingly many real world tasks involve data in multiple modalities o...
research
01/12/2023

Multimodal Deep Learning

This book is the result of a seminar in which we reviewed multimodal app...

Please sign up or login with your details

Forgot password? Click here to reset