Whether you can locate or not? Interactive Referring Expression Generation

08/19/2023
by   Fulong Ye, et al.

Referring Expression Generation (REG) aims to generate unambiguous Referring Expressions (REs) for objects in a visual scene; its dual task, Referring Expression Comprehension (REC), aims to locate the referred object. Existing methods train REG models independently, using only the REs as ground truth, without considering the potential interaction between REG and REC models. In this paper, we propose an Interactive REG (IREG) model that can interact with a real REC model, using signals that indicate whether the object was located, together with the visual region the REC model located, to gradually modify REs. Our experimental results on three RE benchmark datasets, RefCOCO, RefCOCO+, and RefCOCOg, show that IREG outperforms previous state-of-the-art methods on popular evaluation metrics. Furthermore, a human evaluation shows that, with the capability of interaction, IREG generates better REs.
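
The interaction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `generate` and `locate` callables, the round limit, and the IoU threshold of 0.5 used as the "located" signal are all assumptions made for illustration, and the abstract does not specify any of them. The sketch only shows the control flow in which an REC model's feedback (whether the object was located, and which region it chose) drives revisions of the expression.

```python
from typing import Callable, List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def interactive_reg(
    target: Box,
    # Hypothetical REG model: (target box, previous RE, region the REC model chose) -> new RE
    generate: Callable[[Box, Optional[str], Optional[Box]], str],
    # Hypothetical REC model: RE -> predicted box
    locate: Callable[[str], Box],
    max_rounds: int = 3,         # assumed limit on interaction rounds
    iou_threshold: float = 0.5,  # assumed criterion for "located"
) -> Tuple[str, bool, List[Tuple[str, bool]]]:
    """Revise an expression until the REC model locates the target or rounds run out."""
    expression = generate(target, None, None)  # initial RE, no feedback yet
    history: List[Tuple[str, bool]] = []
    located = False
    for round_index in range(max_rounds):
        predicted = locate(expression)
        located = iou(predicted, target) >= iou_threshold
        history.append((expression, located))
        if located or round_index == max_rounds - 1:
            break
        # Feed back the failed RE and the wrongly located region.
        expression = generate(target, expression, predicted)
    return expression, located, history
```

Every expression that is returned has been checked by `locate`, so the boolean flag always refers to the final expression. In practice `generate` and `locate` would wrap trained REG and REC models; here they are left as plain callables so the loop can be exercised with simple stand-ins.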

research
03/30/2021

Locate then Segment: A Strong Pipeline for Referring Image Segmentation

Referring image segmentation aims to segment the objects referred by a n...
research
09/18/2019

Dynamic Graph Attention for Referring Expression Comprehension

Referring expression comprehension aims to locate the object instance de...
research
01/12/2017

Comprehension-guided referring expressions

We consider generation and comprehension of natural language referring e...
research
11/29/2018

Towards Human-Friendly Referring Expression Generation

This paper addresses the generation of referring expressions that not on...
research
10/24/2022

Towards Unifying Reference Expression Generation and Comprehension

Reference Expression Generation (REG) and Comprehension (REC) are two hi...
research
03/19/2020

Multi-task Collaborative Network for Joint Referring Expression Comprehension and Segmentation

Referring expression comprehension (REC) and segmentation (RES) are two ...
research
05/30/2023

DisCLIP: Open-Vocabulary Referring Expression Generation

Referring Expressions Generation (REG) aims to produce textual descripti...
