Deep Matching Autoencoders

11/16/2017
by Tanmoy Mukherjee, et al.

Increasingly many real-world tasks involve data in multiple modalities or views. This has motivated the development of many effective algorithms for learning a common latent space to relate multiple domains. However, most existing cross-view learning algorithms assume access to paired data for training. Their applicability is therefore limited, as the paired-data assumption is often violated in practice: many tasks have only a small subset of data with pairing annotation, or no paired data at all. In this paper we introduce Deep Matching Autoencoders (DMAE), which learn a common latent space and a pairing from unpaired multi-modal data. Specifically, we formulate this as a joint cross-domain representation-learning and object-matching problem, simultaneously optimising the parameters of the representation-learning autoencoders and the pairing of the unpaired multi-modal data. This framework elegantly spans the full regime from fully supervised through semi-supervised to unsupervised (no paired data) multi-modal learning. We show promising results on image captioning and on a new task that is uniquely enabled by our methodology: unsupervised classifier learning.
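The alternating scheme the abstract describes (fit the per-modality autoencoders under the current pairing, then re-estimate the pairing in the shared latent space) can be sketched concretely. The sketch below is a minimal illustration in PyTorch, assuming simple MLP autoencoders, an L2 latent-alignment loss, and Hungarian matching for the pairing step; these choices are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal sketch of the DMAE idea: jointly learn per-modality autoencoders
# and a pairing between unpaired samples. Layer sizes, losses, and the use
# of Hungarian matching are assumptions, not the authors' exact method.
import torch
import torch.nn as nn
from scipy.optimize import linear_sum_assignment

class Autoencoder(nn.Module):
    def __init__(self, in_dim, latent_dim):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(),
                                 nn.Linear(64, latent_dim))
        self.dec = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(),
                                 nn.Linear(64, in_dim))

    def forward(self, x):
        z = self.enc(x)
        return z, self.dec(z)

def estimate_pairing(z_a, z_b):
    # Matching step: pick, for each sample in view A, the sample in view B
    # that minimises total latent-space distance (Hungarian assignment).
    cost = torch.cdist(z_a, z_b).cpu().numpy()
    _, col = linear_sum_assignment(cost)
    return torch.as_tensor(col, dtype=torch.long)

# Toy unpaired data: two views of n samples whose correspondence is unknown.
n, dim_a, dim_b, latent = 128, 20, 30, 8
x_a, x_b = torch.randn(n, dim_a), torch.randn(n, dim_b)

ae_a, ae_b = Autoencoder(dim_a, latent), Autoencoder(dim_b, latent)
opt = torch.optim.Adam(list(ae_a.parameters()) + list(ae_b.parameters()),
                       lr=1e-3)
pairing = torch.randperm(n)  # initial guess at the correspondence

for epoch in range(50):
    # Representation step: reconstruction losses for both autoencoders plus
    # a latent alignment loss under the current pairing.
    z_a, rec_a = ae_a(x_a)
    z_b, rec_b = ae_b(x_b)
    loss = (nn.functional.mse_loss(rec_a, x_a)
            + nn.functional.mse_loss(rec_b, x_b)
            + nn.functional.mse_loss(z_a, z_b[pairing]))
    opt.zero_grad()
    loss.backward()
    opt.step()
    # Alternate: periodically re-estimate the pairing from current latents.
    if epoch % 5 == 0:
        with torch.no_grad():
            pairing = estimate_pairing(ae_a(x_a)[0], ae_b(x_b)[0])
```

In the fully supervised regime the pairing is fixed to the known correspondence; in the semi-supervised regime only the unannotated subset would be re-matched.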


Related research

Multi-modal Latent Diffusion (06/07/2023)
Multi-modal data-sets are ubiquitous in modern applications, and multi-m...

Learning Paired-associate Images with An Unsupervised Deep Learning Architecture (12/20/2013)
This paper presents an unsupervised multi-modal learning system that lea...

Multi-modal space structure: a new kind of latent correlation for multi-modal entity resolution (04/21/2018)
Multi-modal data is becoming more common than before because of big data...

Private-Shared Disentangled Multimodal VAE for Learning of Hybrid Latent Representations (12/23/2020)
Multi-modal generative models represent an important family of deep mode...

DeMIAN: Deep Modality Invariant Adversarial Network (12/23/2016)
Obtaining common representations from different modalities is important ...

Learning Robust Visual-Semantic Embeddings (03/17/2017)
Many of the existing methods for learning joint embedding of images and ...

Synthesising Multi-Modal Minority Samples for Tabular Data (05/17/2021)
Real-world binary classification tasks are in many cases imbalanced, whe...
