Referring Expression Comprehension Using Language Adaptive Inference

by Wei Su, et al.
Zhejiang University

Unlike universal object detection, referring expression comprehension (REC) aims to locate the specific objects referred to by natural language expressions. An expression provides high-level concepts of the relevant visual and contextual patterns. These patterns vary significantly across expressions and account for only a small fraction of those encoded in the REC model. This leads us to a question: do we really need the entire network with a fixed structure for every referring expression? Ideally, given an expression, only the expression-relevant components of the REC model are required. These components should be few in number, as each expression contains only a few visual and contextual clues. This paper explores the adaptation between expressions and REC models for dynamic inference. Concretely, we propose a neat yet efficient framework named Language Adaptive Dynamic Subnets (LADS), which extracts language-adaptive subnets from the REC model conditioned on the referring expressions. By using the compact subnet, inference can be more economical and efficient. Extensive experiments on RefCOCO, RefCOCO+, RefCOCOg, and ReferIt show that the proposed method achieves faster inference and higher accuracy than state-of-the-art approaches.


