Associating Spatially-Consistent Grouping with Text-supervised Semantic Segmentation

04/03/2023
by Yabo Zhang, et al.

In this work, we investigate performing semantic segmentation solely through training on image-sentence pairs. Lacking dense annotations, existing text-supervised methods can only learn to group an image into semantic regions via pixel-insensitive feedback. As a result, their grouped results are coarse and often contain small spurious regions, which limits the upper-bound performance of segmentation. In contrast, we observe that grouped results from self-supervised models are more semantically consistent and can break this bottleneck. Motivated by this, we associate self-supervised spatially-consistent grouping with text-supervised semantic segmentation. Because the grouped results are part-like, we further adapt a text-supervised model from image-level to region-level recognition with two core designs. First, we encourage fine-grained alignment with a one-way noun-to-region contrastive loss, which reduces mismatched noun-region pairs. Second, we adopt a contextually aware masking strategy to enable simultaneous recognition of all grouped regions. Coupled with spatially-consistent grouping and region-adapted recognition, our method achieves 59.2, significantly surpassing state-of-the-art methods.
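The abstract does not spell out the form of the one-way noun-to-region contrastive loss, so the following is only a minimal sketch of one plausible reading: an InfoNCE-style cross-entropy computed in the noun-to-region direction only, with no symmetric region-to-noun term. The function names, the cosine similarity, the temperature value of 0.07, and the assumption that each noun comes with the index of its matching region are all illustrative choices, not details taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def noun_to_region_loss(nouns, regions, positives, temperature=0.07):
    """Hypothetical one-way contrastive loss (noun -> region only).

    nouns:     list of noun embeddings
    regions:   list of region embeddings
    positives: positives[i] is the index of the region assumed to match
               noun i (an assumption of this sketch, not of the paper)
    """
    total = 0.0
    for noun, pos in zip(nouns, positives):
        # Similarity of this noun to every candidate region, scaled by temperature.
        logits = [cosine(noun, region) / temperature for region in regions]
        # Numerically stable log-sum-exp over all regions.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Cross-entropy with the matched region as the target class.
        total += log_denominator - logits[pos]
    return total / len(nouns)


# Toy example: two nouns, three regions, embeddings in 2-D.
nouns = [[1.0, 0.0], [0.0, 1.0]]
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
aligned = noun_to_region_loss(nouns, regions, positives=[0, 1])
misaligned = noun_to_region_loss(nouns, regions, positives=[1, 0])
```

In this toy setup the loss is lower when each noun is paired with the region it actually resembles than when the pairs are swapped, which is the behavior a loss meant to reduce mismatched noun-region pairs would need. Because the softmax runs only over regions for each noun, regions are never forced to match some noun, which is one way to read "one-way".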

research
10/12/2022

Dynamic Clustering Network for Unsupervised Semantic Segmentation

Recently, the ability of self-supervised Vision Transformer (ViT) to rep...
research
02/24/2022

Fully Self-Supervised Learning for Semantic Segmentation

In this work, we present a fully self-supervised framework for semantic ...
research
08/14/2023

ICPC: Instance-Conditioned Prompting with Contrastive Learning for Semantic Segmentation

Modern supervised semantic segmentation methods are usually finetuned ba...
research
07/22/2023

Morphology-inspired Unsupervised Gland Segmentation via Selective Semantic Grouping

Designing deep learning algorithms for gland segmentation is crucial for...
research
12/01/2022

Learning to Generate Text-grounded Mask for Open-world Semantic Segmentation from Only Image-Text Pairs

We tackle open-world semantic segmentation, which aims at learning to se...
research
12/05/2020

Self-Supervised Visual Representation Learning from Hierarchical Grouping

We create a framework for bootstrapping visual representation learning f...
research
03/26/2022

Semantic Segmentation by Early Region Proxy

Typical vision backbones manipulate structured features. As a compromise...
