Target-Guided Composed Image Retrieval

09/04/2023
by   Haokun Wen, et al.

Composed image retrieval (CIR) is a new and flexible image retrieval paradigm that retrieves a target image for a multimodal query consisting of a reference image and its corresponding modification text. Although existing efforts have achieved compelling success, they overlook two issues: modeling the conflict relationship between the reference image and the modification text, which would improve multimodal query composition, and modeling adaptive matching degrees, which would improve the ranking of candidate images that match the given query to different degrees. To address these two limitations, we propose a Target-Guided Composed Image Retrieval network (TG-CIR). TG-CIR first extracts unified global and local attribute features for the reference/target image and the modification text, using the contrastive language-image pre-training model (CLIP) as the backbone and introducing an orthogonal regularization to promote independence among the attribute features. It then applies a target-query relationship-guided multimodal query composition module comprising a target-free student composition branch and a target-based teacher composition branch, where the target-query relationship is injected into the teacher branch to guide the conflict relationship modeling of the student branch. Finally, in addition to the conventional batch-based classification loss, TG-CIR introduces a batch-based target similarity-guided matching degree regularization to promote the metric learning process. Extensive experiments on three benchmark datasets demonstrate the superiority of the proposed method.
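The abstract names three training signals without giving their formulas, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes cosine similarity with a temperature, an orthogonality penalty of the form ||A Aᵀ - I||² over L2-normalized attribute features, and a KL divergence between target-to-candidate and query-to-candidate similarity distributions as the matching degree regularizer. All function names and the `temperature` parameter are hypothetical.

```python
import math

def l2_normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def orthogonal_penalty(attrs):
    """Assumed form: squared Frobenius norm of (A A^T - I), rows L2-normalized."""
    rows = [l2_normalize(a) for a in attrs]
    penalty = 0.0
    for i, a in enumerate(rows):
        for j, b in enumerate(rows):
            expected = 1.0 if i == j else 0.0
            penalty += (dot(a, b) - expected) ** 2
    return penalty

def batch_classification_loss(queries, targets, temperature=0.1):
    """Cross-entropy over in-batch targets; the positive for query i is target i."""
    q = [l2_normalize(v) for v in queries]
    t = [l2_normalize(v) for v in targets]
    loss = 0.0
    for i, qi in enumerate(q):
        probs = softmax([dot(qi, tj) / temperature for tj in t])
        loss -= math.log(probs[i])
    return loss / len(q)

def matching_degree_regularizer(queries, targets, temperature=0.1):
    """Assumed form: KL(target-to-candidate || query-to-candidate), batch mean."""
    q = [l2_normalize(v) for v in queries]
    t = [l2_normalize(v) for v in targets]
    total = 0.0
    for qi, ti in zip(q, t):
        p_target = softmax([dot(ti, tj) / temperature for tj in t])
        p_query = softmax([dot(qi, tj) / temperature for tj in t])
        total += sum(p * math.log(p / r) for p, r in zip(p_target, p_query))
    return total / len(q)
```

Under these assumptions, the classification term treats every other in-batch target as an equally wrong negative, while the regularizer lets candidates that resemble the true target keep a proportionally higher score. That is one way to read "different levels of matching degrees" in the abstract.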

