Modality-Aware Triplet Hard Mining for Zero-shot Sketch-Based Image Retrieval

12/15/2021
by Zongheng Huang, et al.

This paper tackles the Zero-Shot Sketch-Based Image Retrieval (ZS-SBIR) problem from the viewpoint of cross-modality metric learning. The task has two characteristics: 1) the zero-shot setting requires a metric space with good within-class compactness and between-class discrepancy for recognizing novel classes, and 2) the sketch query and the photo gallery are in different modalities. The metric learning viewpoint benefits ZS-SBIR in two ways. First, it lets ZS-SBIR draw on recent good practices in deep metric learning (DML). By combining two fundamental learning approaches in DML, i.e., classification training and pairwise training, we set up a strong baseline for ZS-SBIR. Without bells and whistles, this baseline achieves competitive retrieval accuracy. Second, it provides the insight that properly suppressing the modality gap is critical. To this end, we design a novel method named Modality-Aware Triplet Hard Mining (MATHM). MATHM enhances the baseline with three types of pairwise learning, i.e., cross-modality sample pairs, within-modality sample pairs, and their combination. We also design an adaptive weighting method to dynamically balance these three components during training. Experimental results confirm that MATHM brings a further significant improvement over the strong baseline and sets a new state of the art. For example, on the TU-Berlin dataset, we achieve 47.88+2.94 mAP@all and 58.28+2.34. Code: https://github.com/huangzongheng/MATHM.
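The three types of pairwise learning named in the abstract can be illustrated with a batch-hard triplet loss whose candidate pairs are filtered by modality. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the function names (`batch_hard_triplet`, `combined_loss`), the Euclidean distance, the margin value, and the fixed weights are all hypothetical. In particular, the fixed weights merely stand in for the adaptive weighting method, which the abstract does not specify.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def batch_hard_triplet(embs, labels, mods, mode, margin=0.3):
    """Batch-hard triplet loss restricted to pairs allowed by `mode`.

    mode = "cross":  candidates come from the other modality only
    mode = "within": candidates come from the anchor's own modality only
    mode = "all":    candidates come from both modalities
    (These mode definitions are an assumption made for illustration.)
    """
    def allowed(i, j):
        if i == j:
            return False
        if mode == "cross":
            return mods[i] != mods[j]
        if mode == "within":
            return mods[i] == mods[j]
        return True

    losses = []
    n = len(embs)
    for i in range(n):
        pos = [euclidean(embs[i], embs[j]) for j in range(n)
               if allowed(i, j) and labels[j] == labels[i]]
        neg = [euclidean(embs[i], embs[j]) for j in range(n)
               if allowed(i, j) and labels[j] != labels[i]]
        if pos and neg:
            # Hardest positive is the farthest; hardest negative is the closest.
            losses.append(max(0.0, margin + max(pos) - min(neg)))
    # Anchors without a valid positive or negative are skipped.
    return sum(losses) / len(losses) if losses else 0.0


def combined_loss(embs, labels, mods, weights=(1.0, 1.0, 1.0), margin=0.3):
    """Weighted sum of the three mining modes (fixed, hypothetical weights)."""
    modes = ("cross", "within", "all")
    return sum(w * batch_hard_triplet(embs, labels, mods, m, margin)
               for w, m in zip(weights, modes))


# Toy batch: one sketch and one photo for each of two classes.
embs = [[0.0], [1.0], [0.2], [3.0]]
labels = [0, 0, 1, 1]
mods = ["sketch", "photo", "sketch", "photo"]
print(combined_loss(embs, labels, mods))
```

In this toy batch each modality holds only one sample per class, so the within-modality term finds no positives and contributes nothing; a real batch would need at least two samples per class and modality for that term to be active.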

research
02/20/2023

Ontology-aware Network for Zero-shot Sketch-based Image Retrieval

Zero-Shot Sketch-Based Image Retrieval (ZSSBIR) is an emerging task. The...
research
06/22/2021

Domain-Smoothing Network for Zero-Shot Sketch-Based Image Retrieval

Zero-Shot Sketch-Based Image Retrieval (ZS-SBIR) is a novel cross-modal ...
research
06/01/2023

Class Anchor Margin Loss for Content-Based Image Retrieval

The performance of neural networks in content-based image retrieval (CBI...
research
07/04/2018

Deep Cross-modality Adaptation via Semantics Preserving Adversarial Learning for Sketch-based 3D Shape Retrieval

Due to the large cross-modality discrepancy between 2D sketches and 3D s...
research
06/19/2023

Renderers are Good Zero-Shot Representation Learners: Exploring Diffusion Latents for Metric Learning

Can the latent spaces of modern generative neural rendering models serve...
research
03/01/2019

A Sketch Based 3D Shape Retrieval Approach Based on Efficient Deep Point-to-Subspace Metric Learning

One key issue in managing a large scale 3D shape dataset is to identify ...
research
06/25/2020

Adaptive additive classification-based loss for deep metric learning

Recent works have shown that deep metric learning algorithms can benefit...
