Cross-modal Deep Metric Learning with Multi-task Regularization

03/21/2017
by Xin Huang, et al.

Cross-modal retrieval based on deep neural networks (DNNs) has become an active research topic; it lets users retrieve results across modalities such as image and text. However, existing methods mainly focus on the pairwise correlation and reconstruction error of labeled data. They ignore the semantic similarity and dissimilarity constraints between modalities, and they cannot take advantage of unlabeled data. This paper proposes Cross-modal Deep Metric Learning with Multi-task Regularization (CDMLMR), which integrates a quadruplet ranking loss and a semi-supervised contrastive loss to model cross-modal semantic similarity in a unified multi-task learning architecture. The quadruplet ranking loss models the semantically similar and dissimilar constraints in order to preserve cross-modal relative similarity ranking information. The semi-supervised contrastive loss maximizes semantic similarity on both labeled and unlabeled data. Compared with existing methods, CDMLMR exploits not only similarity ranking information but also unlabeled cross-modal data, and thus improves cross-modal retrieval accuracy.
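
To make the two loss terms more concrete, here is a minimal sketch in plain Python. It is not the formulation from the paper, which the abstract does not spell out. It assumes generic textbook forms: a hinge-style ranking loss on quadruplets and a margin-based contrastive loss on pairs. The function names, the squared Euclidean distance, the `margin` values, and the `weight` that combines the two terms are all hypothetical choices made for illustration. For unlabeled pairs, the `similar` flag is assumed to come from some external estimate; how that estimate is obtained is not covered here.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two embedding vectors."""
    return float(sum((x - y) ** 2 for x, y in zip(a, b)))


def quadruplet_ranking_loss(img_pos, txt_pos, img_neg, txt_neg, margin=1.0):
    """Hinge-style ranking loss on one quadruplet (assumed generic form).

    (img_pos, txt_pos) is a semantically similar cross-modal pair and
    (img_neg, txt_neg) a dissimilar one. The loss is zero once the
    dissimilar pair is at least `margin` farther apart than the similar pair.
    """
    gap = sq_dist(img_pos, txt_pos) - sq_dist(img_neg, txt_neg)
    return max(0.0, margin + gap)


def contrastive_loss(x, y, similar, margin=1.0):
    """Margin-based contrastive loss on one pair (assumed generic form).

    Similar pairs are pulled together; dissimilar pairs are pushed apart
    until their distance reaches `margin`. For unlabeled data, `similar`
    is assumed to be an externally estimated flag.
    """
    d2 = sq_dist(x, y)
    if similar:
        return d2
    return max(0.0, margin - math.sqrt(d2)) ** 2


def multitask_loss(quadruplets, pairs, weight=0.5):
    """Weighted sum of the two mean losses (hypothetical combination)."""
    rank = sum(quadruplet_ranking_loss(*q) for q in quadruplets) / len(quadruplets)
    cont = sum(contrastive_loss(*p) for p in pairs) / len(pairs)
    return rank + weight * cont


# Toy 2-D embeddings, invented purely for illustration.
quads = [([0, 0], [0, 1], [0, 0], [3, 0])]
pairs = [([0, 0], [0, 1], True), ([0, 0], [0.5, 0], False)]
total = multitask_loss(quads, pairs)
```

In this toy setup the ranking term vanishes because the dissimilar pair is already much farther apart than the similar one, so only the contrastive term contributes to `total`. In a trained system the embeddings would be produced by modality-specific networks rather than fixed by hand.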


