PhraseCut: Language-based Image Segmentation in the Wild

08/03/2020
by Chenyun Wu, et al.

We consider the problem of segmenting image regions given a natural language phrase, and study it on a novel dataset of 77,262 images and 345,486 phrase-region pairs. Our dataset is collected on top of the Visual Genome dataset and uses the existing annotations to generate a challenging set of referring phrases for which the corresponding regions are manually annotated. Phrases in our dataset correspond to multiple regions and describe a large number of object and stuff categories as well as their attributes such as color, shape, parts, and relationships with other entities in the image. Our experiments show that the scale and diversity of concepts in our dataset pose significant challenges to existing state-of-the-art methods. We systematically handle the long-tail nature of these concepts and present a modular approach to combine category, attribute, and relationship cues that outperforms existing approaches.
