Reliability-Hierarchical Memory Network for Scribble-Supervised Video Object Segmentation

03/25/2023
by   Zikun Zhou, et al.

This paper addresses video object segmentation (VOS) in a scribble-supervised manner: VOS models are both trained with sparse scribble annotations and initialized with a sparse target scribble at inference. This substantially lightens the annotation burden for both training and initialization. Scribble-supervised VOS is difficult in two respects. First, the model must learn from sparse scribble annotations during training. Second, it must reason about the whole target during inference given only a sparse initial target scribble. In this work, we propose a Reliability-Hierarchical Memory Network (RHMNet) that predicts the target mask with a step-wise expanding strategy organized by memory reliability level. Specifically, RHMNet first uses only the memory in the high-reliability level to locate the high-reliability region of the target, that is, the region highly similar to the initial target scribble. It then expands this region to the entire target, conditioned on the region itself and on the memories in all reliability levels. In addition, we propose a scribble-supervised learning mechanism that helps the model learn to predict dense results. It mines the pixel-level relation within a single frame and the frame-level relation within the sequence to take full advantage of the scribble annotations in sequence training samples. Favorable performance on two popular benchmarks demonstrates that our method is promising.
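The two-step prediction described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: RHMNet is a learned network, whereas the code below is a hand-written nearest-neighbor analogue, and every name in it (`stepwise_predict`, `seed_threshold`, the `(feature, is_target, level)` memory layout, cosine similarity as the matching score) is an assumption made for illustration only. It shows the idea of first matching against high-reliability target memory to find seed pixels, then expanding to the remaining pixels using the seeds plus memory from all reliability levels.

```python
import math


def cosine(a, b):
    """Cosine similarity between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def best_match(feat, entries):
    """Highest similarity between feat and any entry (-1.0 if there are none)."""
    return max((cosine(feat, m) for m in entries), default=-1.0)


def stepwise_predict(query, memory, seed_threshold=0.95):
    """Toy two-step mask prediction (hypothetical, for illustration).

    query:  list of per-pixel feature vectors for the current frame.
    memory: list of (feature, is_target, level); level 0 is assumed to be
            the most reliable (e.g. pixels on the initial scribble).
    """
    # Step 1: locate the high-reliability region using only
    # high-reliability target memory.
    reliable_target = [f for f, is_t, lvl in memory if is_t and lvl == 0]
    seeds = [best_match(q, reliable_target) >= seed_threshold for q in query]

    # Step 2: expand from the seeds, conditioned on the seed region itself
    # and on the memory entries of all reliability levels.
    target_refs = [f for f, is_t, _ in memory if is_t]
    target_refs += [q for q, s in zip(query, seeds) if s]
    background_refs = [f for f, is_t, _ in memory if not is_t]
    mask = [
        s or best_match(q, target_refs) > best_match(q, background_refs)
        for q, s in zip(query, seeds)
    ]
    return seeds, mask


# Made-up 2-D features: one scribble pixel, one lower-reliability target
# pixel, and one background pixel in memory; three query pixels.
memory = [
    ([1.0, 0.0], True, 0),
    ([0.8, 0.6], True, 1),
    ([0.0, 1.0], False, 0),
]
query = [[1.0, 0.05], [0.7, 0.7], [0.1, 1.0]]
seeds, mask = stepwise_predict(query, memory)
print(seeds)  # [True, False, False]
print(mask)   # [True, True, False]
```

In this example the second query pixel is not similar enough to the scribble to become a seed, but it is recovered in the expansion step because it matches a lower-reliability target entry better than the background.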


