Query-guided Regression Network with Context Policy for Phrase Grounding

08/04/2017
by Kan Chen, et al.
Given a textual description of an image, phrase grounding localizes the objects in the image that are referred to by query phrases in the description. State-of-the-art methods address the problem by ranking a set of proposals by their relevance to each query; these methods are limited by the performance of independent proposal generation systems and ignore useful cues from context in the description. In this paper, we adopt a spatial regression method to break this performance limit, and introduce reinforcement learning techniques to further leverage semantic context information. We propose a novel Query-guided Regression network with Context policy (QRC Net), which jointly learns a Proposal Generation Network (PGN), a Query-guided Regression Network (QRN) and a Context Policy Network (CPN). Experiments show QRC Net provides a significant improvement in accuracy on two popular datasets: Flickr30K Entities and Referit Game, with 14.25
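To make the contrast between proposal ranking and spatial regression concrete, the following is a minimal sketch and not the QRC Net implementation. It assumes each proposal already has a query relevance score and a set of predicted offsets; the function names, the (x1, y1, x2, y2) box format, and the R-CNN-style offset parameterization are all illustrative assumptions that do not come from the abstract.

```python
import math

def rank_proposals(scores):
    """Ranking-only grounding: index of the proposal most relevant to the query."""
    return max(range(len(scores)), key=lambda i: scores[i])

def apply_regression(box, offsets):
    """Refine a box (x1, y1, x2, y2) with offsets (tx, ty, tw, th).

    Assumed parameterization (R-CNN style, not taken from the paper):
    the center shifts by a fraction of the box size, and the width and
    height scale by exp(tw) and exp(th).
    """
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    cx, cy = x1 + 0.5 * w, y1 + 0.5 * h
    tx, ty, tw, th = offsets
    new_cx, new_cy = cx + tx * w, cy + ty * h
    new_w, new_h = w * math.exp(tw), h * math.exp(th)
    return (new_cx - 0.5 * new_w, new_cy - 0.5 * new_h,
            new_cx + 0.5 * new_w, new_cy + 0.5 * new_h)

def ground_phrase(proposals, scores, offsets):
    """Pick the top-ranked proposal, then refine it by regression."""
    best = rank_proposals(scores)
    return apply_regression(proposals[best], offsets[best])

# Hypothetical inputs: two proposals, their query relevance scores,
# and one set of regression offsets per proposal.
proposals = [(0.0, 0.0, 10.0, 10.0), (20.0, 20.0, 40.0, 60.0)]
scores = [0.2, 0.9]
offsets = [(0.0, 0.0, 0.0, 0.0), (0.1, 0.0, 0.0, 0.0)]
refined_box = ground_phrase(proposals, scores, offsets)
```

A ranking-only method would return the selected proposal unchanged, so its output can never be better than the best available proposal. The regression step lets the final box move away from the proposal, which is one reading of how a spatial regression method can exceed the limit set by the proposal generator.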


