Multimodal Generative Models for Scalable Weakly-Supervised Learning

02/14/2018
by Mike Wu, et al.

Multiple modalities often co-occur when describing natural phenomena. Learning a joint representation of these modalities should yield deeper and more useful representations. Previous work has proposed generative models to handle multimodal input, but these models either do not learn a joint distribution or require complex additional computations to handle missing data. Here, we introduce a multimodal variational autoencoder (MVAE) that uses a product-of-experts inference network and a sub-sampled training paradigm to solve the multimodal inference problem. Notably, our model shares parameters to learn efficiently under any combination of missing modalities, thereby enabling weakly-supervised learning. We apply our method to four datasets and show that it matches state-of-the-art performance with far fewer parameters; in each case, our approach also yields strong weakly-supervised results. Finally, we consider a case study of learning image transformations (edge detection, colorization, facial landmark segmentation, etc.) as a set of modalities, and find appealing results across this range of tasks.
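The central mechanism is the product-of-experts inference network: each observed modality contributes a Gaussian "expert" over the shared latent code, and the joint posterior is approximated by the product of those experts together with the prior. Because a product of Gaussians is itself Gaussian with precision-weighted parameters, a missing modality can simply be dropped from the product, with no extra inference networks or imputation. The sketch below illustrates that combination step in NumPy; the function name and array layout are illustrative assumptions, not the authors' released implementation.

import numpy as np

def product_of_gaussian_experts(mus, logvars):
    """Combine per-modality Gaussian posteriors and a N(0, I) prior expert
    into a single joint Gaussian, as in a product-of-experts inference
    network. `mus` and `logvars` have shape (num_observed_modalities,
    latent_dim); missing modalities are simply not included."""
    latent_dim = mus.shape[1]
    # The prior acts as an extra expert with zero mean and unit variance,
    # so the product is well-defined even when no modality is observed.
    mus = np.concatenate([np.zeros((1, latent_dim)), mus], axis=0)
    logvars = np.concatenate([np.zeros((1, latent_dim)), logvars], axis=0)

    precisions = np.exp(-logvars)              # 1 / sigma^2 for each expert
    joint_var = 1.0 / precisions.sum(axis=0)   # precisions add under a product of Gaussians
    joint_mu = joint_var * (precisions * mus).sum(axis=0)  # precision-weighted mean
    return joint_mu, np.log(joint_var)

# Example: two of three modalities observed, 4-dimensional latent space.
mus = np.random.randn(2, 4)
logvars = np.random.randn(2, 4)
joint_mu, joint_logvar = product_of_gaussian_experts(mus, logvars)

The sub-sampled training paradigm mentioned in the abstract evaluates the objective on the full set of modalities as well as on subsets of them, which relies on exactly this ability to form the posterior from any combination of experts.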

Related research

- Variational Mixture-of-Experts Autoencoders for Multi-Modal Deep Generative Models (11/08/2019)
- Learning more expressive joint distributions in multimodal variational methods (09/08/2020)
- Model enhancement and personalization using weakly supervised learning for multi-modal mobile sensing (10/29/2019)
- On the Limitations of Multimodal VAEs (10/08/2021)
- Image Generation with Multimodal Priors using Denoising Diffusion Probabilistic Models (06/10/2022)
- Private-Shared Disentangled Multimodal VAE for Learning of Hybrid Latent Representations (12/23/2020)
- Deep Multiple Instance Feature Learning via Variational Autoencoder (07/06/2018)
