SIGN: Spatial-information Incorporated Generative Network for Generalized Zero-shot Semantic Segmentation

08/27/2021
by Jiaxin Cheng, et al.

Unlike conventional zero-shot classification, zero-shot semantic segmentation predicts a class label at the pixel level instead of the image level. When solving zero-shot semantic segmentation problems, the need for pixel-level prediction with surrounding context motivates us to incorporate spatial information using positional encoding. We improve standard positional encoding by introducing the concept of Relative Positional Encoding, which integrates spatial information at the feature level and can handle arbitrary image sizes. Furthermore, while self-training is widely used in zero-shot semantic segmentation to generate pseudo-labels, we propose a new knowledge-distillation-inspired self-training strategy, namely Annealed Self-Training, which can automatically assign different importance to pseudo-labels to improve performance. We systematically study the proposed Relative Positional Encoding and Annealed Self-Training in a comprehensive experimental evaluation, and our empirical results confirm the effectiveness of our method on three benchmark datasets.
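The abstract names two ideas without giving formulas, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that a positional encoding can be made independent of image size by computing sinusoids over coordinates normalized to [0, 1], and that pseudo-labels can be weighted by a temperature-scaled softmax over their confidence scores, in the spirit of knowledge distillation. The function names, the frequency choice, and the weighting rule are all hypothetical.

```python
import numpy as np


def relative_positional_encoding(height, width, channels):
    """Sinusoidal encoding over coordinates normalized to [0, 1].

    Assumption (not from the paper): normalizing by the feature-map size
    is one way to make the encoding work for arbitrary image sizes.
    Returns an array of shape (height, width, channels).
    """
    if channels % 4 != 0:
        raise ValueError("channels must be divisible by 4")
    n_freq = channels // 4
    ys = np.linspace(0.0, 1.0, height)          # relative row positions
    xs = np.linspace(0.0, 1.0, width)           # relative column positions
    freqs = (2.0 ** np.arange(n_freq)) * np.pi  # assumed frequency bands
    y_ang = ys[:, None] * freqs[None, :]        # (height, n_freq)
    x_ang = xs[:, None] * freqs[None, :]        # (width, n_freq)
    y_enc = np.concatenate([np.sin(y_ang), np.cos(y_ang)], axis=1)
    x_enc = np.concatenate([np.sin(x_ang), np.cos(x_ang)], axis=1)
    # Broadcast the row and column encodings over the full grid.
    y_map = np.broadcast_to(y_enc[:, None, :], (height, width, 2 * n_freq))
    x_map = np.broadcast_to(x_enc[None, :, :], (height, width, 2 * n_freq))
    return np.concatenate([y_map, x_map], axis=2)


def annealed_pseudo_label_weights(confidence, temperature):
    """Temperature-scaled softmax weights over pseudo-label confidences.

    Assumption (not from the paper): a high temperature treats all
    pseudo-labels almost equally, and lowering it shifts weight toward
    the most confident ones. Weights are rescaled to have mean 1.
    """
    conf = np.asarray(confidence, dtype=float)
    logits = conf / temperature
    logits = logits - logits.max()  # subtract the max for numerical stability
    weights = np.exp(logits)
    weights = weights / weights.sum()
    return weights * conf.size
```

In this sketch the encoding at the corners of the grid is the same for any feature-map size, which is what allows it to be concatenated with features of arbitrary resolution. The per-pixel weights would multiply the loss on pseudo-labeled pixels, with the temperature lowered over the course of training according to a schedule chosen by the user.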

research
06/03/2019

Zero-Shot Semantic Segmentation

Semantic segmentation models are limited in their ability to scale to la...
research
07/01/2020

Learning unbiased zero-shot semantic segmentation networks via transductive transfer

Semantic segmentation, which aims to acquire a detailed understanding of...
research
01/18/2023

Class Enhancement Losses with Pseudo Labels for Zero-shot Semantic Segmentation

Recent mask proposal models have significantly improved the performance ...
research
02/26/2021

Recursive Training for Zero-Shot Semantic Segmentation

General purpose semantic segmentation relies on a backbone CNN network t...
research
01/28/2023

ZegOT: Zero-shot Segmentation Through Optimal Transport of Text Prompts

Recent success of large-scale Contrastive Language-Image Pre-training (C...
research
03/23/2019

Residual Pyramid Learning for Single-Shot Semantic Segmentation

Pixel-level semantic segmentation is a challenging task with a huge amou...
research
06/11/2021

Conterfactual Generative Zero-Shot Semantic Segmentation

Zero-shot learning is an essential part of computer vision. As a classic...
