PIRC Net : Using Proposal Indexing, Relationships and Context for Phrase Grounding

12/07/2018
by Rama Kovvuri, et al.

Phrase grounding aims to detect and localize the objects in an image that are referred to by natural language query phrases. It finds applications in tasks such as visual dialog, visual search, and image-text co-reference resolution. In this paper, we present a framework that leverages phrase category, relationships among neighboring phrases in a sentence, and context to improve the performance of phrase grounding systems. We propose three modules: the Proposal Indexing Network (PIN), the Inter-phrase Regression Network (IRN), and the Proposal Ranking Network (PRN), each of which analyzes the region proposals of an image at an increasing level of detail by incorporating the above information. In addition, for the weakly-supervised setting, where ground-truth spatial locations of the phrases are unavailable, we propose knowledge transfer mechanisms that leverage the framework of the PIN module. We demonstrate the effectiveness of our approach on the Flickr 30k Entities and ReferItGame datasets, on which we achieve improvements over state-of-the-art approaches in both the supervised and weakly-supervised variants.


research
11/12/2015

Grounding of Textual Phrases in Images by Reconstruction

Grounding (i.e. localizing) arbitrary, free-form textual phrases in visu...
research
06/06/2020

MAGNet: Multi-Region Attention-Assisted Grounding of Natural Language Queries at Phrase Level

Grounding free-form textual queries necessitates an understanding of the...
research
07/03/2020

Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation

Weakly supervised phrase grounding aims at learning region-phrase corres...
research
03/11/2018

Knowledge Aided Consistency for Weakly Supervised Phrase Grounding

Given a natural language query, a phrase grounding system aims to locali...
research
08/04/2017

Query-guided Regression Network with Context Policy for Phrase Grounding

Given a textual description of an image, phrase grounding localizes obje...
research
05/03/2017

Weakly-supervised Visual Grounding of Phrases with Linguistic Structures

We propose a weakly-supervised approach that takes image-sentence pairs ...
research
05/01/2018

Weakly Supervised Attention Learning for Textual Phrases Grounding

Grounding textual phrases in visual content is a meaningful yet challeng...
