PolyFormer: Referring Image Segmentation as Sequential Polygon Generation

02/14/2023
by   Jiang Liu, et al.

In this work, instead of directly predicting pixel-level segmentation masks, we formulate referring image segmentation as sequential polygon generation; the predicted polygons can later be converted into segmentation masks. This is enabled by a new sequence-to-sequence framework, Polygon Transformer (PolyFormer), which takes a sequence of image patches and text query tokens as input and autoregressively outputs a sequence of polygon vertices. For more accurate geometric localization, we propose a regression-based decoder that predicts precise floating-point coordinates directly, without any coordinate quantization error. In experiments, PolyFormer outperforms the prior art by a clear margin, e.g., 5.40% and 4.52% absolute improvements on the challenging RefCOCO+ and RefCOCOg datasets. It also shows strong generalization ability when evaluated on the referring video segmentation task without fine-tuning, e.g., achieving a competitive 61.5% J&F on the Ref-DAVIS17 dataset.
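The two geometric ideas in the abstract, converting polygon vertices into a mask and the error introduced by coordinate quantization, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `polygon_to_mask` and `quantize`, the even-odd rasterization rule, and the bin count are all assumptions chosen for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in pixel units, floating point


def polygon_to_mask(vertices: List[Point], height: int, width: int) -> List[List[int]]:
    """Rasterize one polygon into a binary mask (hypothetical helper).

    Uses the even-odd rule and tests each pixel center against the polygon.
    """
    mask = [[0] * width for _ in range(height)]
    n = len(vertices)
    for row in range(height):
        y = row + 0.5  # pixel center
        for col in range(width):
            x = col + 0.5
            inside = False
            for i in range(n):
                x1, y1 = vertices[i]
                x2, y2 = vertices[(i + 1) % n]
                # Does this edge cross the horizontal line through the pixel center?
                if (y1 > y) != (y2 > y):
                    x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
                    if x < x_cross:
                        inside = not inside
            mask[row][col] = int(inside)
    return mask


def quantize(value: float, num_bins: int, size: float) -> float:
    """Snap a coordinate in [0, size] to the center of one of num_bins bins.

    Illustrates the quantization step that a classification-style coordinate
    decoder would need; a regression decoder outputs `value` directly.
    """
    bin_width = size / num_bins
    index = min(int(value / bin_width), num_bins - 1)
    return (index + 0.5) * bin_width


# A rectangle given as an ordered vertex sequence, as a decoder might emit it.
rectangle = [(1.0, 1.0), (5.0, 1.0), (5.0, 4.0), (1.0, 4.0)]
mask = polygon_to_mask(rectangle, height=6, width=6)
covered_pixels = sum(sum(row) for row in mask)

# Quantizing a coordinate shifts it by up to half a bin width.
error = abs(quantize(3.3, num_bins=64, size=64.0) - 3.3)
```

With 64 bins over a 64-pixel axis, the quantized coordinate can be off by up to half a pixel; predicting the floating-point value directly avoids that discretization step.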


