Cross-modal Common Representation Learning by Hybrid Transfer Network

06/01/2017
by Xin Huang, et al.

Cross-modal retrieval based on deep neural networks (DNNs) is an active research topic that aims to retrieve across different modalities such as image and text, but existing methods often suffer from insufficient cross-modal training data. In the single-modal scenario, a similar problem is usually relieved by transferring knowledge from large-scale auxiliary datasets (such as ImageNet). Knowledge from such single-modal datasets is also very useful for cross-modal retrieval, because it can provide rich general semantic information that can be shared across modalities. However, it is challenging to transfer useful knowledge from a single-modal (e.g., image) source domain to a cross-modal (e.g., image/text) target domain. Knowledge in the source domain cannot be directly transferred to both modalities in the target domain, and the inherent cross-modal correlation contained in the target domain provides key hints for cross-modal retrieval that should be preserved during the transfer process. This paper proposes the Cross-modal Hybrid Transfer Network (CHTN) with two subnetworks: the modal-sharing transfer subnetwork uses the modality present in both the source and target domains as a bridge to transfer knowledge to both modalities simultaneously; the layer-sharing correlation subnetwork preserves the inherent cross-modal semantic correlation to further adapt to the cross-modal retrieval task. CHTN converts cross-modal data into a common representation for retrieval, and comprehensive experiments on 3 datasets show its effectiveness.
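To make the retrieval step concrete, the following is a minimal sketch of how items from one modality might be ranked against a query from another once both have been mapped to a common representation. It does not implement CHTN itself. The vectors are hypothetical placeholders for network outputs, and the use of cosine similarity as the ranking measure is an assumption, since the abstract does not name one. The helper names `cosine` and `rank_by_similarity` are illustrative only.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def rank_by_similarity(query, gallery):
    """Return gallery indices ordered from most to least similar to the query."""
    scores = [cosine(query, item) for item in gallery]
    return sorted(range(len(gallery)), key=lambda i: scores[i], reverse=True)


# Hypothetical common representations (assumed, not taken from the paper):
# one image query and three text items, all in the same 3-dimensional space.
image_query = [0.9, 0.1, 0.0]
text_gallery = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.5, 0.5, 0.0],
]

ranking = rank_by_similarity(image_query, text_gallery)
print(ranking)  # indices of text items, best match first
```

Because both modalities live in one space, the same function serves image-to-text and text-to-image retrieval by swapping which side supplies the query.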

