Natural Language Object Retrieval

11/13/2015
by   Ronghang Hu, et al.
0

In this paper, we address the task of natural language object retrieval, to localize a target object within a given image based on a natural language query of the object. Natural language object retrieval differs from text-based image retrieval task as it involves spatial information about objects within the scene and global scene context. To address this issue, we propose a novel Spatial Context Recurrent ConvNet (SCRC) model as scoring function on candidate boxes for object retrieval, integrating spatial configurations and global scene-level contextual information into the network. Our model processes query text, local image descriptors, spatial configurations and global context features through a recurrent network, outputs the probability of the query text conditioned on each candidate box as a score for the box, and can transfer visual-linguistic knowledge from image captioning domain to our task. Experimental results demonstrate that our method effectively utilizes both local and global information, outperforming previous baseline methods significantly on different datasets and scenarios, and can exploit large scale vision and language datasets for knowledge transfer.

READ FULL TEXT

page 1

page 3

page 7

page 8

page 9

page 10

page 11

page 12

research
03/16/2018

Object Captioning and Retrieval with Natural Language

We address the problem of jointly learning vision and language to unders...
research
08/27/2018

Single Shot Scene Text Retrieval

Textual information found in scene images provides high level semantic i...
research
04/04/2019

VQD: Visual Query Detection in Natural Scenes

We propose Visual Query Detection (VQD), a new visual grounding task. In...
research
06/23/2020

Robot Object Retrieval with Contextual Natural Language Queries

Natural language object retrieval is a highly useful yet challenging tas...
research
03/22/2017

An End-to-End Approach to Natural Language Object Retrieval via Context-Aware Deep Reinforcement Learning

We propose an end-to-end approach to the natural language object retriev...
research
05/24/2017

Attention-based Natural Language Person Retrieval

Following the recent progress in image classification and captioning usi...
research
03/20/2016

Segmentation from Natural Language Expressions

In this paper we approach the novel problem of segmenting an image based...

Please sign up or login with your details

Forgot password? Click here to reset