Learning Joint Embedding for Cross-Modal Retrieval

08/21/2019
by Donghuo Zeng, et al.

Cross-modal retrieval uses a query in one modality to obtain relevant data in another modality. The main challenge lies in bridging the heterogeneity gap so that similarity can be computed across modalities, a problem that has been broadly discussed in image-text, audio-text, and video-text multimedia data mining and retrieval. However, the gap between the temporal structures of different data modalities is not well addressed, because alignment relationships between temporal cross-modal structures are lacking. Our research focuses on learning the correlation between different modalities for the task of cross-modal retrieval. We have previously proposed an architecture for this task, Supervised-Deep Canonical Correlation Analysis (S-DCCA). In this forum paper, we discuss how to exploit triplet neural networks (TNN) to enhance correlation learning for cross-modal retrieval. The experimental results show that the proposed TNN-based supervised correlation learning architecture achieves the best results when the data representations are extracted by supervised learning.
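
As a rough illustration of the triplet idea mentioned in the abstract, the sketch below computes a margin-based triplet loss over toy embeddings from two modalities. It is a minimal sketch under stated assumptions, not the authors' architecture: the abstract does not specify the loss form, the similarity measure, or the margin, so the cosine similarity, the margin of 0.2, the function names `cosine_similarity` and `triplet_loss`, and the toy vectors are all hypothetical choices made for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss (assumed form, not taken from the paper).

    The loss is zero once the anchor is more similar to the positive
    than to the negative by at least `margin`.
    """
    sim_pos = cosine_similarity(anchor, positive)
    sim_neg = cosine_similarity(anchor, negative)
    return max(0.0, margin - sim_pos + sim_neg)


# Toy joint-space embeddings (hypothetical values).
query_embedding = [1.0, 0.0]      # query from one modality
matching_item = [1.0, 0.0]        # item from the other modality, same class
non_matching_item = [0.0, 1.0]    # item from the other modality, other class

# Well-separated triplet: no penalty.
loss_easy = triplet_loss(query_embedding, matching_item, non_matching_item)

# Swapped roles: the "positive" is dissimilar, so the loss is large.
loss_hard = triplet_loss(query_embedding, non_matching_item, matching_item)

print(loss_easy, loss_hard)
```

In a trained system the embeddings would come from modality-specific networks rather than hand-written vectors; the toy values here only show how the loss separates a well-ordered triplet from a badly ordered one.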

research
08/10/2019

Deep Triplet Neural Networks with Cluster-CCA for Audio-Visual Cross-modal Retrieval

Cross-modal retrieval aims to retrieve data in one modality by a query i...
research
03/29/2022

On Metric Learning for Audio-Text Cross-Modal Retrieval

Audio-text retrieval aims at retrieving a target audio clip or caption f...
research
07/19/2018

Revisiting Cross Modal Retrieval

This paper proposes a cross-modal retrieval system that leverages on ima...
research
12/05/2021

Variational Autoencoder with CCA for Audio-Visual Cross-Modal Retrieval

Cross-modal retrieval is to utilize one modality as a query to retrieve ...
research
11/24/2017

Deep Cross-Modal Correlation Learning for Audio and Lyrics in Music Retrieval

Little research focuses on cross-modal correlation learning where tempor...
research
04/28/2018

Learning Cross-Modal Deep Embeddings for Multi-Object Image Retrieval using Text and Sketch

In this work we introduce a cross modal image retrieval system that allo...
research
11/28/2014

Cross-Modal Learning via Pairwise Constraints

In multimedia applications, the text and image components in a web docum...
