Drill-down: Interactive Retrieval of Complex Scenes using Natural Language Queries

11/10/2019
by Fuwen Tan, et al.

This paper explores the task of interactive image retrieval using natural language queries, where a user progressively provides input queries to refine a set of retrieval results. We study this problem in the context of complex image scenes containing multiple objects. We propose Drill-down, an effective framework for encoding multiple queries with an efficient, compact state representation that significantly extends current methods for single-round image retrieval. We show that using multiple rounds of natural language queries as input can be surprisingly effective for finding arbitrarily specific images of complex scenes. Furthermore, we find that existing image datasets with textual captions can provide an effective form of weak supervision for this task. We compare our method with existing sequential encoding and embedding networks, demonstrating superior performance on two proposed benchmarks: automatic image retrieval in a simulated scenario that uses region captions as queries, and interactive image retrieval using real queries from human evaluators.
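To make the idea of multi-round retrieval with a compact state more concrete, here is a minimal toy sketch. It is not the authors' architecture: it assumes a bag-of-words embedding in place of learned text and image encoders, a fixed number of state slots that each absorb one or more queries, and a score that matches every slot to its best image region. All names (`QueryState`, `embed`, `rank`, `IMAGES`) and the example captions are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words embedding (stand-in for a learned encoder)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class QueryState:
    """Fixed-size state: the number of slots never grows with the dialog."""

    def __init__(self, num_slots=2):
        self.slots = [Counter() for _ in range(num_slots)]

    def update(self, query):
        q = embed(query)
        # Assumption: fill an empty slot first, otherwise merge the query
        # into the slot it is most similar to.
        empty = [i for i, s in enumerate(self.slots) if not s]
        if empty:
            idx = empty[0]
        else:
            idx = max(range(len(self.slots)),
                      key=lambda i: cosine(self.slots[i], q))
        self.slots[idx] = self.slots[idx] + q

    def score(self, regions):
        # Each non-empty slot is matched to its best-matching region.
        return sum(max(cosine(s, r) for r in regions)
                   for s in self.slots if s)

def rank(state, images):
    """Return image names sorted from best to worst match."""
    return sorted(images, key=lambda name: state.score(images[name]),
                  reverse=True)

# Hypothetical images, each described by a few region captions.
IMAGES = {
    "park": [embed("a brown dog on grass"),
             embed("a red frisbee in the air")],
    "street": [embed("a small brown dog on a leash"),
               embed("a yellow taxi on the road")],
    "kitchen": [embed("a white plate on a table"),
                embed("a red kettle on the stove")],
}

if __name__ == "__main__":
    state = QueryState(num_slots=2)
    for query in ["a brown dog", "a yellow taxi"]:
        state.update(query)
        print(query, "->", rank(state, IMAGES))
```

In this toy setup the first query alone cannot separate the two images that contain a brown dog, while the second query narrows the results to the street scene. The state stays the same size regardless of how many queries the user issues.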

research
07/02/2017

Where to Play: Retrieval of Video Segments using Natural-Language Queries

In this paper, we propose a new approach for retrieval of video segments...
research
06/30/2011

Structured Knowledge Representation for Image Retrieval

We propose a structured approach to the problem of retrieval of images b...
research
03/17/2023

IRGen: Generative Modeling for Image Retrieval

While generative modeling has been ubiquitous in natural language proces...
research
02/09/2021

Telling the What while Pointing the Where: Fine-grained Mouse Trace and Language Supervision for Improved Image Retrieval

Existing image retrieval systems use text queries to provide a natural a...
research
07/30/2020

From A Glance to "Gotcha": Interactive Facial Image Retrieval with Progressive Relevance Feedback

Facial image retrieval plays a significant role in forensic investigatio...
research
06/12/2023

Sticker820K: Empowering Interactive Retrieval with Stickers

Stickers have become a ubiquitous part of modern-day communication, conv...
research
11/19/2014

A Pooling Approach to Modelling Spatial Relations for Image Retrieval and Annotation

Over the last two decades we have witnessed strong progress on modeling ...