Efficient Image-Text Retrieval via Keyword-Guided Pre-Screening

03/14/2023
by   Min Cao, et al.

Despite steady gains in accuracy, current image-text retrieval methods suffer from N-dependent time complexity, which hinders their application in practice. To improve efficiency, this paper presents a simple and effective keyword-guided pre-screening framework for image-text retrieval. Specifically, we convert the image and text data into keywords and perform keyword matching across modalities to exclude a large number of irrelevant gallery samples before the retrieval network is applied. For keyword prediction, we cast the task as a multi-label classification problem and propose a multi-task learning scheme that appends multi-label classifiers to the image-text retrieval network, achieving lightweight and high-performance keyword prediction. For keyword matching, we introduce the inverted index used in search engines, which benefits both the time and space complexity of the pre-screening. Extensive experiments on two widely used datasets, Flickr30K and MS-COCO, verify the effectiveness of the proposed framework. Equipped with only two embedding layers, the proposed framework achieves O(1) querying time complexity and, when applied before common image-text retrieval methods, improves retrieval efficiency while preserving their performance. Our code will be released.
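To make the keyword-matching step concrete, here is a minimal sketch of inverted-index pre-screening. It is not the authors' implementation: the function names, the toy gallery, and the matching rule (keep any gallery item that shares at least one keyword with the query) are assumptions for illustration, and the keywords are hard-coded rather than predicted by multi-label classifiers.

```python
from collections import defaultdict


def build_inverted_index(gallery_keywords):
    """Map each keyword to the set of gallery item ids that carry it."""
    index = defaultdict(set)
    for item_id, keywords in gallery_keywords.items():
        for kw in keywords:
            index[kw].add(item_id)
    return index


def pre_screen(index, query_keywords):
    """Return gallery ids sharing at least one keyword with the query.

    Assumed matching rule; the abstract does not state the exact criterion.
    """
    candidates = set()
    for kw in query_keywords:
        # .get avoids creating empty entries for unseen keywords
        candidates |= index.get(kw, set())
    return candidates


# Hypothetical gallery: keywords would normally come from a classifier.
gallery = {
    "img_0": {"dog", "grass"},
    "img_1": {"cat", "sofa"},
    "img_2": {"dog", "beach"},
    "img_3": {"car", "street"},
}
index = build_inverted_index(gallery)

# Only the surviving candidates would be passed to the retrieval network.
candidates = pre_screen(index, {"dog", "ball"})
print(sorted(candidates))  # ['img_0', 'img_2']
```

In this sketch each query costs one lookup per query keyword, regardless of how many items the gallery holds, which is the kind of saving the pre-screening step aims for.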


research
05/26/2021

Quotient Space-Based Keyword Retrieval in Sponsored Search

Synonymous keyword retrieval has become an important problem for sponsor...
research
11/23/2022

Embedding Compression for Text Classification Using Dictionary Screening

In this paper, we propose a dictionary screening method for embedding co...
research
05/27/2019

Dynamically Visual Disambiguation of Keyword-based Image Search

Due to the high cost of manual annotation, learning directly from the we...
research
01/03/2020

A Multi-oriented Chinese Keyword Spotter Guided by Text Line Detection

Chinese keyword spotting is a challenging task as there is no visual bla...
research
08/18/2017

Word Searching in Scene Image and Video Frame in Multi-Script Scenario using Dynamic Shape Coding

Retrieval of text information from natural scene images and video frames...
research
05/11/2022

Multi-Label Logo Recognition and Retrieval based on Weighted Fusion of Neural Features

Logo classification is a particular case of image classification, since ...
research
03/01/2023

The style transformer with common knowledge optimization for image-text retrieval

Image-text retrieval which associates different modalities has drawn bro...
