CM-GANs: Cross-modal Generative Adversarial Networks for Common Representation Learning

10/14/2017
by Yuxin Peng, et al.

The inconsistent distribution and representation of different modalities, such as image and text, cause a heterogeneity gap that makes it challenging to correlate such heterogeneous data. Generative adversarial networks (GANs) have shown a strong ability to model data distributions and learn discriminative representations, but existing GAN-based works mainly focus on the generative problem of producing new data. Our goal is different: we aim to correlate heterogeneous data by using the power of GANs to model the cross-modal joint distribution. We therefore propose Cross-modal GANs (CM-GANs) to learn a discriminative common representation that bridges the heterogeneity gap. The main contributions are: (1) A cross-modal GAN architecture is proposed to model the joint distribution over data of different modalities. Inter-modality and intra-modality correlation are explored simultaneously in the generative and discriminative models, which compete against each other to promote cross-modal correlation learning. (2) Cross-modal convolutional autoencoders with a weight-sharing constraint are proposed to form the generative model. They not only exploit cross-modal correlation for learning the common representation, but also preserve reconstruction information for capturing semantic consistency within each modality. (3) A cross-modal adversarial mechanism is proposed, which uses two kinds of discriminative models to conduct intra-modality and inter-modality discrimination simultaneously. Through adversarial training, the two reinforce each other and make the common representation more discriminative. To the best of our knowledge, the proposed CM-GANs approach is the first to use GANs for cross-modal common representation learning. Experiments on the cross-modal retrieval paradigm verify the performance of the proposed approach, comparing it with 10 methods on 3 cross-modal datasets.
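
The weight-sharing idea in contribution (2) can be illustrated with a small sketch. This is not the authors' implementation: it uses plain linear layers with untrained random weights instead of convolutional autoencoders, and it omits the decoders, both discriminators, and the adversarial training entirely. All names (`SharedEncoders`, `rank_by_similarity`, the layer sizes) are hypothetical. It only shows how modality-specific projections followed by one shared layer place image and text features in a single common space where they can be compared for retrieval.

```python
import math
import random


def linear(weights, vec):
    """Multiply a weight matrix (a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def random_matrix(rows, cols, rng):
    """Untrained placeholder weights; a real model would learn these."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SharedEncoders:
    """Two modality-specific projections followed by ONE shared layer."""

    def __init__(self, image_dim, text_dim, hidden_dim, common_dim, seed=0):
        rng = random.Random(seed)
        self.image_proj = random_matrix(hidden_dim, image_dim, rng)
        self.text_proj = random_matrix(hidden_dim, text_dim, rng)
        # The same matrix is used by both encoders (weight sharing).
        self.shared = random_matrix(common_dim, hidden_dim, rng)

    def encode_image(self, feat):
        hidden = [math.tanh(h) for h in linear(self.image_proj, feat)]
        return linear(self.shared, hidden)

    def encode_text(self, feat):
        hidden = [math.tanh(h) for h in linear(self.text_proj, feat)]
        return linear(self.shared, hidden)


def rank_by_similarity(query, candidates):
    """Return candidate indices sorted from most to least similar."""
    return sorted(
        range(len(candidates)),
        key=lambda i: cosine(query, candidates[i]),
        reverse=True,
    )


if __name__ == "__main__":
    enc = SharedEncoders(image_dim=4, text_dim=3, hidden_dim=5, common_dim=2)
    image_code = enc.encode_image([0.2, -0.1, 0.5, 0.9])
    text_codes = [enc.encode_text(t) for t in ([1.0, 0.0, 0.3], [-0.4, 0.8, 0.1])]
    # Image-to-text retrieval: rank texts by similarity in the common space.
    print(rank_by_similarity(image_code, text_codes))
```

Because the weights here are random, the ranking carries no meaning; in the approach described above, the shared layer would be trained so that matching image and text pairs end up close together in the common space.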


research
12/01/2017

Unsupervised Generative Adversarial Cross-modal Hashing

Cross-modal hashing aims to map heterogeneous multimedia data into a com...
research
04/02/2018

SyncGAN: Synchronize the Latent Space of Cross-modal Generative Adversarial Networks

Generative adversarial network (GAN) has achieved impressive success on ...
research
03/25/2021

Discriminative Semantic Transitive Consistency for Cross-Modal Learning

Cross-modal retrieval is generally performed by projecting and aligning ...
research
07/19/2020

Symbiotic Adversarial Learning for Attribute-based Person Search

Attribute-based person search is in significant demand for applications ...
research
06/08/2018

Hierarchy of GANs for learning embodied self-awareness model

In recent years several architectures have been proposed to learn embodi...
research
03/26/2019

Cross-modal subspace learning with Kernel correlation maximization and Discriminative structure preserving

The measure between heterogeneous data is still an open problem. Many re...
research
04/11/2021

Integrating Information Theory and Adversarial Learning for Cross-modal Retrieval

Accurately matching visual and textual data in cross-modal retrieval has...
