MUNet: Motion Uncertainty-aware Semi-supervised Video Object Segmentation

11/29/2021
by Jiadai Sun, et al.

Semi-supervised video object segmentation (VOS) has advanced greatly, and dense matching-based methods now achieve state-of-the-art performance. Recent methods leverage space-time memory (STM) networks and learn to retrieve relevant information from all available sources: past frames with object masks form an external memory, and the current frame, acting as the query, is segmented using the mask information in that memory. However, when forming the memory and performing matching, these methods exploit only appearance information and ignore motion information. In this paper, we advocate the return of motion information and propose a motion uncertainty-aware framework (MUNet) for semi-supervised VOS. First, we propose an implicit method to learn the spatial correspondences between neighboring frames, building upon a correlation cost volume. To handle the challenging cases of occlusion and textureless regions when constructing dense correspondences, we incorporate uncertainty into dense matching and obtain a motion uncertainty-aware feature representation. Second, we introduce a motion-aware spatial attention module to effectively fuse the motion feature with the semantic feature. Comprehensive experiments on challenging benchmarks show that combining a small amount of training data with powerful motion information brings a significant performance boost. We achieve 76.5% 𝒥&ℱ using only DAVIS17 for training, which significantly outperforms the SOTA methods under the low-data protocol. The code will be released.
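
To make the idea of a correlation cost volume with matching uncertainty more concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the MUNet implementation: the abstract does not specify how the cost volume or the uncertainty is computed, so the local search window, the scaled dot-product correlation, the entropy-based uncertainty measure, and the function names `local_cost_volume` and `matching_uncertainty` are all hypothetical choices made for this example.

```python
import numpy as np


def local_cost_volume(feat_prev, feat_cur, radius=1):
    """Correlate each pixel of feat_cur with a local window in feat_prev.

    feat_prev, feat_cur: feature maps of shape (C, H, W).
    Returns an array of shape (D, H, W), D = (2 * radius + 1) ** 2, where
    channel dy * (2 * radius + 1) + dx holds the scaled dot product between
    feat_cur[:, y, x] and feat_prev[:, y + dy - radius, x + dx - radius].
    Positions outside the previous frame are treated as zero features.
    """
    c, h, w = feat_cur.shape
    padded = np.pad(feat_prev, ((0, 0), (radius, radius), (radius, radius)))
    volume = []
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            shifted = padded[:, dy:dy + h, dx:dx + w]
            volume.append((feat_cur * shifted).sum(axis=0) / np.sqrt(c))
    return np.stack(volume, axis=0)


def matching_uncertainty(cost_volume):
    """Normalized entropy of the softmax over displacements, in [0, 1].

    Assumption for this sketch: an ambiguous match (flat correlation
    profile, as in textureless regions) is treated as uncertain, while a
    single sharp peak is treated as confident.
    """
    z = cost_volume - cost_volume.max(axis=0, keepdims=True)
    p = np.exp(z)
    p /= p.sum(axis=0, keepdims=True)
    entropy = -(p * np.log(p + 1e-12)).sum(axis=0)
    return entropy / np.log(cost_volume.shape[0])


# Textureless features: every displacement looks equally good.
flat = np.ones((4, 5, 5))
flat_unc = matching_uncertainty(local_cost_volume(flat, flat))

# Distinctive features: each pixel has its own (scaled) one-hot descriptor.
sharp = np.sqrt(80.0) * np.eye(16).reshape(16, 4, 4)
sharp_cv = local_cost_volume(sharp, sharp)
sharp_unc = matching_uncertainty(sharp_cv)
```

In this toy setup, interior pixels of the textureless map receive an uncertainty close to 1, while the distinctive map yields a sharp peak at the zero-displacement channel and an uncertainty close to 0. A map of this kind could, for example, be used to down-weight unreliable motion features before fusing them with semantic features, although how MUNet does this is described only in the full paper.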


