Knowledge-guided Pairwise Reconstruction Network for Weakly Supervised Referring Expression Grounding

09/05/2019
by Xuejing Liu, et al.

Weakly supervised referring expression grounding (REG) aims to localize the entity referred to by a linguistic query in an image, where the mapping between image regions (proposals) and the query is unknown during training. In referring expressions, people usually describe a target entity by its relationships with other contextual entities as well as by its visual attributes. However, previous weakly supervised REG methods rarely model the relationships between entities. In this paper, we propose a knowledge-guided pairwise reconstruction network (KPRN), which models the relationship between the target entity (subject) and the contextual entity (object) and grounds both entities. Specifically, we first design a knowledge extraction module to guide the proposal selection for the subject and object. The prior knowledge takes the form of semantic similarities between each proposal and the subject/object. Second, guided by this knowledge, we design subject and object attention modules to construct subject-object proposal pairs: the subject attention excludes unrelated proposals from the candidates, and the object attention selects the most suitable proposal as the contextual proposal. Third, we introduce a pairwise attention and an adaptive weighting scheme to learn the correspondence between these proposal pairs and the query. Finally, a pairwise reconstruction module measures the grounding quality for weakly supervised learning. Extensive experiments on four large-scale datasets show that our method outperforms existing state-of-the-art methods by a large margin.
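
The knowledge-guided pairing step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the semantic similarities are computed or how proposals are filtered, so the sketch assumes cosine similarity over toy embedding vectors, a hypothetical `keep_threshold` for the subject attention, and an argmax for the object attention. The names `cosine` and `build_pairs` are illustrative only.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_pairs(proposal_vecs, subject_vec, object_vec, keep_threshold=0.5):
    """Return (subject_index, object_index) proposal pairs.

    Assumptions (not from the paper): similarities are cosine scores,
    the subject attention is a hard threshold, and the object attention
    is a hard argmax.
    """
    # Prior knowledge: similarity of every proposal to the subject and to the object.
    subj_scores = [cosine(p, subject_vec) for p in proposal_vecs]
    obj_scores = [cosine(p, object_vec) for p in proposal_vecs]

    # Subject attention: exclude proposals unrelated to the subject.
    candidates = [i for i, s in enumerate(subj_scores) if s >= keep_threshold]

    # Object attention: select the single most suitable contextual proposal.
    context = max(range(len(proposal_vecs)), key=lambda i: obj_scores[i])

    # Pair every surviving subject candidate with the contextual proposal.
    return [(i, context) for i in candidates]


# Toy example: four proposals embedded in a 3-dimensional space.
proposals = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
pairs = build_pairs(proposals, subject_vec=[1.0, 0.0, 0.0], object_vec=[0.0, 1.0, 0.0])
print(pairs)  # [(0, 2), (1, 2)]
```

In this toy setup, proposals 0 and 1 survive the subject filter and proposal 2 is chosen as the context, giving two candidate pairs. In the method described above, such pairs would then be scored against the query by the pairwise attention and reconstruction modules, which this sketch does not cover.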

research
08/28/2019

Adaptive Reconstruction Network for Weakly Supervised Referring Expression Grounding

Weakly supervised referring expression grounding aims at localizing the ...
research
02/22/2023

Focusing On Targets For Improving Weakly Supervised Visual Grounding

Weakly supervised visual grounding aims to predict the region in an imag...
research
04/17/2020

CPARR: Category-based Proposal Analysis for Referring Relationships

The task of referring relationships is to localize subject and object en...
research
07/18/2022

Entity-enhanced Adaptive Reconstruction Network for Weakly Supervised Referring Expression Grounding

Weakly supervised Referring Expression Grounding (REG) aims to ground a ...
research
08/03/2022

Integrating Object-aware and Interaction-aware Knowledge for Weakly Supervised Scene Graph Generation

Recently, increasing efforts have been focused on Weakly Supervised Scen...
research
03/24/2021

Relation-aware Instance Refinement for Weakly Supervised Visual Grounding

Visual grounding, which aims to build a correspondence between visual ob...
research
03/16/2023

LOCATE: Localize and Transfer Object Parts for Weakly Supervised Affordance Grounding

Humans excel at acquiring knowledge through observation. For example, we...
