Referring Expression Comprehension: A Survey of Methods and Datasets

07/19/2020
by Yanyuan Qiao, et al.

Referring expression comprehension (REC) aims to localize a target object in an image described by a referring expression phrased in natural language. Unlike object detection, where the queried object labels are pre-defined, REC can only observe its queries at test time. It is thus more challenging than a conventional computer vision problem. This task has attracted a lot of attention from both the computer vision and natural language processing communities, and several lines of work have been proposed, from CNN-RNN models and modular networks to complex graph-based models. In this survey, we first examine the state of the art by comparing modern approaches to the problem. We classify methods by the mechanism they use to encode the visual and textual modalities. In particular, we examine the common approach of jointly embedding images and expressions into a common feature space. We also discuss modular architectures and graph-based models that interface with structured graph representations. In the second part of this survey, we review the datasets available for training and evaluating REC systems. We then group results according to datasets, backbone models, and settings so that they can be fairly compared. Finally, we discuss promising future directions for the field, in particular compositional referring expression comprehension, which requires longer reasoning chains to address.

research
03/01/2020

Cops-Ref: A new Dataset and Task on Compositional Referring Expression Comprehension

Referring expression comprehension (REF) aims at identifying a particula...
research
11/17/2017

Parallel Attention: A Unified Framework for Visual Object Discovery through Dialogs and Queries

Recognising objects according to a pre-defined fixed set of class labels...
research
12/12/2018

Neighbourhood Watch: Referring Expression Comprehension via Language-guided Graph Attention Networks

The task in referring expression comprehension is to localise the object...
research
07/31/2022

One for All: One-stage Referring Expression Comprehension with Dynamic Reasoning

Referring Expression Comprehension (REC) is one of the most important ta...
research
03/18/2020

MUTATT: Visual-Textual Mutual Guidance for Referring Expression Comprehension

Referring expression comprehension (REC) aims to localize a text-related...
research
01/24/2018

MAttNet: Modular Attention Network for Referring Expression Comprehension

In this paper, we address referring expression comprehension: localizing...
research
09/16/2019

A Real-Time Cross-modality Correlation Filtering Method for Referring Expression Comprehension

Referring expression comprehension aims to localize the object instance ...
