Joint Multimodal Learning with Deep Generative Models

11/07/2016
by Masahiro Suzuki, et al.

We investigate deep generative models that can exchange multiple modalities bi-directionally, e.g., generating images from corresponding text and vice versa. Recent studies have handled multiple modalities with deep generative models such as variational autoencoders (VAEs). However, these models typically assume a one-way conditional relation between modalities, i.e., they can generate modalities in only one direction. To achieve our objective, we must extract a joint representation that captures high-level concepts shared among all modalities and through which they can be exchanged bi-directionally. To this end, we propose the joint multimodal variational autoencoder (JMVAE), in which all modalities are independently conditioned on a joint representation; in other words, it models a joint distribution over the modalities. Furthermore, to generate missing modalities properly from the remaining ones, we develop an additional method, JMVAE-kl, which is trained by reducing the divergence between the JMVAE encoder and separately prepared encoders for the respective modalities. Our experiments show that the proposed method obtains an appropriate joint representation from multiple modalities and generates and reconstructs them more properly than conventional VAEs. We further demonstrate that JMVAE can generate multiple modalities bi-directionally.
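To make the training criterion described above concrete, the two-modality case can be sketched as follows. Here x and w denote the two modalities, z the shared latent variable, and α a weighting factor; this notation is assumed for illustration rather than quoted from the abstract.

\mathcal{L}_{\mathrm{JM}}(x, w) = -D_{\mathrm{KL}}\big(q_{\phi}(z \mid x, w) \,\|\, p(z)\big) + \mathbb{E}_{q_{\phi}(z \mid x, w)}\big[\log p_{\theta_x}(x \mid z) + \log p_{\theta_w}(w \mid z)\big]

\mathcal{L}_{\mathrm{JMkl}}(x, w) = \mathcal{L}_{\mathrm{JM}}(x, w) - \alpha \Big[ D_{\mathrm{KL}}\big(q_{\phi}(z \mid x, w) \,\|\, q_{\phi_x}(z \mid x)\big) + D_{\mathrm{KL}}\big(q_{\phi}(z \mid x, w) \,\|\, q_{\phi_w}(z \mid w)\big) \Big]

The first line is the joint evidence lower bound: both modalities are decoded from the same latent z inferred by the joint encoder q(z | x, w). The second adds the divergence terms that pull the single-modality encoders q(z | x) and q(z | w) toward the joint encoder, which is what allows a missing modality to be generated from the remaining one at test time.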

