Look Before You Match: Instance Understanding Matters in Video Object Segmentation

12/13/2022
by Junke Wang, et al.

By exploring dense matching between the current frame and past frames for long-range context modeling, memory-based methods have recently demonstrated impressive results in video object segmentation (VOS). Nevertheless, because they lack instance understanding, these approaches are often brittle under large appearance variations or viewpoint changes resulting from the movement of objects and cameras. In this paper, we argue that instance understanding matters in VOS, and that integrating it with memory-based matching is synergistic. This follows intuitively from the definition of the VOS task, i.e., identifying and segmenting object instances within a video. Towards this goal, we present a two-branch network for VOS, where the query-based instance segmentation (IS) branch delves into the instance details of the current frame and the VOS branch performs spatial-temporal matching with the memory bank. We employ the well-learned object queries from the IS branch to inject instance-specific information into the query key, with which the instance-augmented matching is then performed. In addition, we introduce a multi-path fusion block to effectively combine the memory readout with multi-scale features from the instance segmentation decoder, which incorporates high-resolution instance-aware features to produce the final segmentation results. Our method achieves state-of-the-art performance on DAVIS 2016/2017 val (92.6 and 87.1 and 86.3
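The instance-augmented matching described in the abstract can be illustrated with a toy, single-pixel sketch. This is not the authors' implementation: the function names (`augment_query_key`, `memory_readout`), the residual attention used to inject object queries into the query key, and the scaled dot-product affinity are all assumptions chosen for illustration, and the vectors are made-up toy values.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def weighted_sum(weights, vectors):
    """Sum of vectors, each scaled by its weight."""
    dim = len(vectors[0])
    return [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]


def augment_query_key(query_key, object_queries):
    """Assumed injection scheme: attend over the object queries and add
    the attended vector to the query key as a residual."""
    weights = softmax([dot(query_key, q) for q in object_queries])
    instance_info = weighted_sum(weights, object_queries)
    return [k + i for k, i in zip(query_key, instance_info)]


def memory_readout(query_key, memory_keys, memory_values):
    """Assumed matching scheme: softmax over scaled dot-product affinities
    to the memory keys, then a weighted sum of the memory values."""
    scale = math.sqrt(len(query_key))
    weights = softmax([dot(query_key, k) / scale for k in memory_keys])
    return weighted_sum(weights, memory_values)


# Toy data (hypothetical): two object queries, two memory entries.
object_queries = [[1.0, 0.0], [0.0, 1.0]]
memory_keys = [[4.0, 0.0], [0.0, 4.0]]
memory_values = [[1.0], [0.0]]  # 1.0 = foreground, 0.0 = background
query_key = [0.5, 0.0]  # weak evidence for the first instance

plain = memory_readout(query_key, memory_keys, memory_values)
augmented = memory_readout(
    augment_query_key(query_key, object_queries), memory_keys, memory_values
)
print(plain, augmented)
```

In this toy setup the augmented key leans further toward the first object query, so its readout assigns more weight to the foreground memory entry than the plain key does. A real model would operate on dense feature maps and learned projections rather than hand-written vectors.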

research
04/13/2021

Crossover Learning for Fast Online Video Instance Segmentation

Modeling temporal visual context across frames is critical for video ins...
research
11/20/2019

Object-Guided Instance Segmentation for Biological Images

Instance segmentation of biological images is essential for studying obj...
research
08/03/2022

MinVIS: A Minimal Video Instance Segmentation Framework without Video-based Training

We propose MinVIS, a minimal video instance segmentation (VIS) framework...
research
07/22/2022

DeVIS: Making Deformable Transformers Work for Video Instance Segmentation

Video Instance Segmentation (VIS) jointly tackles multi-object detection...
research
04/10/2021

Target-Aware Object Discovery and Association for Unsupervised Video Multi-Object Segmentation

This paper addresses the task of unsupervised video multi-object segment...
research
02/15/2023

Offline-to-Online Knowledge Distillation for Video Instance Segmentation

In this paper, we present offline-to-online knowledge distillation (OOKD...
research
12/03/2019

Automatic Video Object Segmentation via Motion-Appearance-Stream Fusion and Instance-aware Segmentation

This paper presents a method for automatic video object segmentation bas...
