Scene Graph based Image Retrieval – A case study on the CLEVR Dataset

11/03/2019
by Sahana Ramnath, et al.

With the proliferation of multimodal interaction in various domains, there has recently been much interest in text-based image retrieval in the computer vision community. However, most state-of-the-art techniques model this problem in a purely neural way, which makes it difficult to incorporate pragmatic strategies when searching a large-scale catalog, especially when the search requirements are insufficiently specified and the model must resort to an interactive retrieval process through multiple iterations of question answering. Motivated by this, we propose a neural-symbolic approach for one-shot retrieval of images from a large-scale catalog, given a caption description. To facilitate this, we represent the catalog and caption as scene graphs and model the retrieval task as a learnable graph matching problem, trained end-to-end with the REINFORCE algorithm. Further, we briefly describe an extension of this pipeline to an iterative retrieval framework based on interactive questioning and answering.
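To make the scene-graph formulation concrete, the following is a minimal illustrative sketch, not the method of the paper. It assumes a scene graph can be reduced to a list of attribute sets (nodes) plus relation triples (edges), and it swaps the learnable, REINFORCE-trained matcher for a hand-written brute-force score that counts matched attributes and relations. The names SceneGraph, match_score, and retrieve, the relation label "left_of", and the toy catalog are all hypothetical.

```python
from dataclasses import dataclass, field
from itertools import permutations


@dataclass
class SceneGraph:
    # One attribute set per node, e.g. {"red", "cube"} (assumed encoding).
    objects: list
    # Edges as (subject_index, predicate, object_index) triples.
    relations: set = field(default_factory=set)


def match_score(query: SceneGraph, image: SceneGraph) -> int:
    """Best score over all injective mappings of query nodes onto image nodes.

    The score counts shared attributes plus query relations preserved by the
    mapping. This fixed count stands in for a learned matching function.
    """
    n_q, n_i = len(query.objects), len(image.objects)
    if n_q > n_i:
        return 0  # the image cannot contain every object in the caption
    best = 0
    for assignment in permutations(range(n_i), n_q):
        score = sum(
            len(query.objects[q] & image.objects[i])
            for q, i in enumerate(assignment)
        )
        score += sum(
            (assignment[s], p, assignment[o]) in image.relations
            for s, p, o in query.relations
        )
        best = max(best, score)
    return best


def retrieve(query: SceneGraph, catalog: dict) -> list:
    """Return catalog keys ranked from best to worst match."""
    return sorted(
        catalog, key=lambda name: match_score(query, catalog[name]), reverse=True
    )


# Caption: "a red cube left of a blue sphere" (toy example).
query = SceneGraph(
    objects=[{"red", "cube"}, {"blue", "sphere"}],
    relations={(0, "left_of", 1)},
)
catalog = {
    "img_a": SceneGraph(
        objects=[{"blue", "sphere", "large"}, {"red", "cube", "small"}],
        relations={(1, "left_of", 0)},  # red cube is left of blue sphere
    ),
    "img_b": SceneGraph(
        objects=[{"red", "cube"}, {"blue", "sphere"}],
        relations={(1, "left_of", 0)},  # blue sphere is left of red cube
    ),
    "img_c": SceneGraph(objects=[{"green", "cylinder"}]),
}
ranking = retrieve(query, catalog)
print(ranking)
```

In this toy catalog, img_a matches every attribute and the relation, img_b matches the attributes but has the relation reversed, and img_c has too few objects. The exhaustive search over assignments is factorial in the number of objects, so it only suits small scenes; a learned matcher, as the abstract describes, would replace both the fixed score and the brute-force search.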


