DMM-Net: Differentiable Mask-Matching Network for Video Object Segmentation

09/27/2019
by Xiaohui Zeng, et al.

In this paper, we propose the differentiable mask-matching network (DMM-Net) for solving the video object segmentation problem where the initial object masks are provided. Relying on the Mask R-CNN backbone, we extract mask proposals per frame and formulate the matching between object templates and proposals at one time step as a linear assignment problem whose cost matrix is predicted by a CNN. We propose a differentiable matching layer obtained by unrolling a projected gradient descent algorithm in which the projection step uses Dykstra's algorithm. We prove that, under mild conditions, the matching is guaranteed to converge to the optimum. In practice, it performs similarly to the Hungarian algorithm during inference, while also allowing us to back-propagate through it to learn the cost matrix. After matching, a refinement head improves the quality of the matched mask. DMM-Net achieves competitive results on YouTube-VOS, the largest video object segmentation dataset. On DAVIS 2017, DMM-Net achieves the best performance among methods without online learning on the first frames. Without any fine-tuning, DMM-Net performs comparably to state-of-the-art methods on the SegTrack v2 dataset. Finally, our matching layer is very simple to implement; we attach the PyTorch code (<50 lines) in the supplementary material. Our code is released at https://github.com/ZENGXH/DMM_Net.
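
To make the matching layer more concrete, the following is a minimal sketch of the forward pass only, written in plain Python rather than PyTorch. It is not the released implementation. The constraint set (every template row sums to one, every proposal column sums to at most one, all entries non-negative), the uniform initialization, the step size, the iteration counts, and all function names are assumptions chosen for illustration. Each step uses only additions, subtractions, and element-wise maxima, so the same sequence of operations written in an autograd framework could in principle be back-propagated through.

```python
def project_rows(X):
    """Project onto {X : every row sums to 1} (an affine set)."""
    return [[x - (sum(row) - 1.0) / len(row) for x in row] for row in X]


def project_cols(X):
    """Project onto {X : every column sums to at most 1} (half-spaces)."""
    n = len(X)
    shift = [max(sum(col) - 1.0, 0.0) / n for col in zip(*X)]
    return [[x - s for x, s in zip(row, shift)] for row in X]


def project_nonneg(X):
    """Project onto {X : every entry is non-negative}."""
    return [[max(x, 0.0) for x in row] for row in X]


def dykstra(Y, projections, n_cycles=50):
    """Approximate the projection of Y onto the intersection of convex sets."""
    X = [row[:] for row in Y]
    # One correction matrix per constraint set, initialised to zero.
    corrections = [[[0.0] * len(row) for row in Y] for _ in projections]
    for _ in range(n_cycles):
        for k, project in enumerate(projections):
            Z = [[x + c for x, c in zip(rx, rc)]
                 for rx, rc in zip(X, corrections[k])]
            X = project(Z)
            corrections[k] = [[z - x for z, x in zip(rz, rx)]
                              for rz, rx in zip(Z, X)]
    return X


def soft_matching(C, n_steps=20, step_size=0.1, n_cycles=50):
    """Unrolled projected gradient descent on the linear cost <C, X>."""
    n, m = len(C), len(C[0])
    X = [[1.0 / m] * m for _ in range(n)]  # assumed uniform starting point
    sets = [project_rows, project_cols, project_nonneg]
    for _ in range(n_steps):
        # The gradient of <C, X> with respect to X is C itself.
        Y = [[x - step_size * c for x, c in zip(rx, rc)]
             for rx, rc in zip(X, C)]
        X = dykstra(Y, sets, n_cycles)
    return X


# Hypothetical 3x3 cost matrix: rows are templates, columns are proposals.
cost = [[10.0, 1.0, 10.0],
        [10.0, 10.0, 1.0],
        [1.0, 10.0, 10.0]]
X = soft_matching(cost)
# Hard assignment: pick the highest-scoring proposal for each template.
assignment = [max(range(len(row)), key=row.__getitem__) for row in X]
```

In this sketch the soft assignment `X` stays a real-valued matrix throughout the iterations, and a hard assignment is only read off at the end by taking the per-row maximum. A Hungarian solver, by contrast, returns a discrete assignment directly and offers no gradient with respect to the cost matrix.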


