Zero-shot Referring Image Segmentation with Global-Local Context Features

03/31/2023
by Seonghoon Yu, et al.

Referring image segmentation (RIS) aims to find a segmentation mask for the image region that a given referring expression describes. Collecting labelled datasets for this task, however, is notoriously costly and labor-intensive. To overcome this issue, we propose a simple yet effective zero-shot referring image segmentation method that leverages the pre-trained cross-modal knowledge of CLIP. To obtain segmentation masks grounded to the input text, we propose a mask-guided visual encoder that captures global and local contextual information of an input image. By using instance masks obtained from off-the-shelf mask proposal techniques, our method is able to segment fine-detailed instance-level groundings. We also introduce a global-local text encoder in which the global feature captures complex sentence-level semantics of the entire input expression, while the local feature focuses on the target noun phrase extracted by a dependency parser. In our experiments, the proposed method outperforms several zero-shot baselines for the task, and even a weakly supervised referring expression segmentation method, by substantial margins. Our code is available at https://github.com/Seonghoon-Yu/Zero-shot-RIS.
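The selection step implied by the abstract can be illustrated with a minimal sketch: each mask proposal gets a global and a local visual feature, the expression gets a global (sentence) and a local (noun phrase) text feature, and the proposal whose visual feature best matches the text feature is returned. The abstract does not state how the global and local features are combined or scored, so the weighted sum, the cosine-similarity scoring, and the names `blend`, `select_mask`, `alpha`, and `beta` are all assumptions made for illustration, not the authors' implementation. Features are plain lists of floats standing in for CLIP embeddings.

```python
import math


def normalize(v):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return list(v)
    return [x / norm for x in v]


def blend(global_feat, local_feat, weight):
    """Weighted sum of two unit-normalized features (assumed combination rule)."""
    g, l = normalize(global_feat), normalize(local_feat)
    return normalize([weight * a + (1.0 - weight) * b for a, b in zip(g, l)])


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))


def select_mask(global_visual, local_visual, global_text, local_text,
                alpha=0.5, beta=0.5):
    """Return (index of best mask proposal, list of scores).

    global_visual / local_visual: one feature per mask proposal.
    global_text / local_text: sentence-level and noun-phrase features.
    alpha, beta: hypothetical mixing weights, not values from the paper.
    """
    if not global_visual or len(global_visual) != len(local_visual):
        raise ValueError("need one global and one local feature per proposal")
    text = blend(global_text, local_text, beta)
    scores = [cosine(blend(g, l, alpha), text)
              for g, l in zip(global_visual, local_visual)]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores
```

In practice the visual features would come from a CLIP image encoder applied to each mask proposal and the text features from a CLIP text encoder; the sketch only covers the final matching step.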


