Hierarchical Similarity Learning for Language-based Product Image Retrieval

02/18/2021
by Zhe Ma, et al.

This paper addresses the task of language-based product image retrieval. Most previous work has advanced the task through the design of network structures, similarity measurements, and loss functions. However, these methods typically perform vision-text matching at a single granularity, ignoring the fact that images are intrinsically multi-granular. In this paper, we focus on cross-modal similarity measurement and propose a novel Hierarchical Similarity Learning (HSL) network. HSL first learns multi-level representations of the input data with stacked encoders, and then computes an object-granularity similarity and an image-granularity similarity at each level. All of these similarities are combined into the final hierarchical cross-modal similarity. Experiments on a large-scale product retrieval dataset demonstrate the effectiveness of the proposed method. Code and data are available at https://github.com/liufh1/hsl.
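The abstract gives no formulas, so the following is only a minimal sketch of how similarities at two granularities might be computed per level and then combined. Everything specific here is an assumption made for illustration rather than the authors' implementation: cosine similarity as the matching function, max over objects for the object-granularity score, mean pooling of object features for the image-granularity score, an unweighted sum as the combination rule, and all function names. The stacked encoders are not modeled; their per-level outputs are taken as given inputs.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_pool(vectors):
    """Average a non-empty list of equal-length vectors into one vector."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def object_similarity(text_vec, object_vecs):
    """Object-granularity score. Assumption: best-matching object wins."""
    return max(cosine(text_vec, obj) for obj in object_vecs)


def image_similarity(text_vec, object_vecs):
    """Image-granularity score. Assumption: the image is the mean of its objects."""
    return cosine(text_vec, mean_pool(object_vecs))


def hierarchical_similarity(text_levels, object_levels):
    """Combine both granularities over all levels.

    text_levels[i] is the query representation at level i, and
    object_levels[i] is the list of object features at level i.
    Assumption: the combination is an unweighted sum.
    """
    if len(text_levels) != len(object_levels):
        raise ValueError("text and image must have the same number of levels")
    total = 0.0
    for text_vec, object_vecs in zip(text_levels, object_levels):
        total += object_similarity(text_vec, object_vecs)
        total += image_similarity(text_vec, object_vecs)
    return total


# Toy inputs: two levels, two objects per image, 2-dimensional features.
text_levels = [[1.0, 0.0], [0.0, 1.0]]
object_levels = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 2.0], [2.0, 0.0]],
]
score = hierarchical_similarity(text_levels, object_levels)
print(f"hierarchical similarity: {score:.4f}")
```

In a retrieval setting, a score of this kind would be computed between the query and every candidate image, and the candidates ranked by it. A learned model would likely weight or otherwise learn the combination; the plain sum is used here only to keep the sketch short.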

research
01/20/2022

Deep Unsupervised Contrastive Hashing for Large-Scale Cross-Modal Text-Image Retrieval in Remote Sensing

Due to the availability of large-scale multi-modal data (e.g., satellite...
research
09/18/2023

Unified Coarse-to-Fine Alignment for Video-Text Retrieval

The canonical approach to video-text retrieval leverages a coarse-graine...
research
08/08/2023

Unifying Two-Stream Encoders with Transformers for Cross-Modal Retrieval

Most existing cross-modal retrieval methods employ two-stream encoders w...
research
04/18/2022

OMG: Observe Multiple Granularities for Natural Language-Based Vehicle Retrieval

Retrieving tracked-vehicles by natural language descriptions plays a cri...
research
05/07/2023

Cross-Modal Retrieval for Motion and Text via MildTriple Loss

Cross-modal retrieval has become a prominent research topic in computer ...
research
08/27/2023

Towards Fast and Accurate Image-Text Retrieval with Self-Supervised Fine-Grained Alignment

Image-text retrieval requires the system to bridge the heterogenous gap ...
research
04/04/2023

AToMiC: An Image/Text Retrieval Test Collection to Support Multimedia Content Creation

This paper presents the AToMiC (Authoring Tools for Multimedia Content) ...
