VQD: Visual Query Detection in Natural Scenes

04/04/2019
by   Manoj Acharya, et al.

We propose Visual Query Detection (VQD), a new visual grounding task. In VQD, a system is guided by natural language to localize a variable number of objects in an image. VQD is related to visual referring expression recognition, where the task is to localize only one object. We describe the first dataset for VQD and we propose baseline algorithms that demonstrate the difficulty of the task compared to referring expression recognition.
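The key difference between the two tasks is the shape of the answer: a referring expression system returns exactly one box, while a VQD system may return zero, one, or many. A minimal sketch of that I/O contract, using a hypothetical `vqd_answer` helper and keyword matching purely for illustration (not the paper's method):

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Box:
    # A bounding box in (x, y, width, height) form.
    x: float
    y: float
    w: float
    h: float

def vqd_answer(query: str, detections: List[Tuple[str, Box]]) -> List[Box]:
    """Return every detected box whose label appears in the query.

    Unlike referring expression recognition, which outputs exactly
    one box, VQD outputs a variable-length (possibly empty) list.
    """
    return [box for label, box in detections if label in query]

# Example: two detections, a query matching one of them.
dets = [("dog", Box(0, 0, 1, 1)), ("cat", Box(1, 1, 2, 2))]
print(len(vqd_answer("find all dogs", dets)))   # one match
print(len(vqd_answer("find the zebra", dets)))  # no matches -> empty list
```

An empty answer is a valid outcome in VQD, which is one reason the task is harder than referring expression recognition, where a single target is always assumed to exist.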


Related research

09/11/2020 · AttnGrounder: Talking to Cars with Attention
We propose Attention Grounder (AttnGrounder), a single-stage end-to-end ...

07/06/2018 · Dynamic Multimodal Instance Segmentation guided by natural language queries
In this paper, we address the task of segmenting an object given a natur...

03/09/2021 · Iterative Shrinking for Referring Expression Grounding Using Deep Reinforcement Learning
In this paper, we are tackling the proposal-free referring expression gr...

05/30/2018 · Visual Referring Expression Recognition: What Do Systems Actually Learn?
We present an empirical analysis of the state-of-the-art systems for ref...

03/17/2018 · Learning Unsupervised Visual Grounding Through Semantic Self-Supervision
Localizing natural language phrases in images is a challenging problem t...

11/13/2015 · Natural Language Object Retrieval
In this paper, we address the task of natural language object retrieval,...

10/19/2021 · Come Again? Re-Query in Referring Expression Comprehension
To build a shared perception of the world, humans rely on the ability to...
