In defense of OSVOS

08/19/2019
by   Yu Liu, et al.

As a milestone in video object segmentation, one-shot video object segmentation (OSVOS) outperforms conventional optical-flow-based methods by a large margin in segmentation accuracy. Its excellent performance benefits mainly from a three-step training mechanism: (1) acquiring object features on the base dataset (i.e. ImageNet); (2) training the parent network on the training set of the target dataset (i.e. DAVIS-2016) so that it can differentiate the object of interest from the background; and (3) fine-tuning online on the object of interest in the first frame of the target test sequence to overfit its appearance, after which the model can be used to segment the same object in the remaining frames of that video. In this paper, we argue that in step (2) OSVOS tends to 'overemphasize' generic semantic object information while 'diluting' the instance cues of the object(s), which largely hinders the whole training process. By adding a common module, video loss, which we formulate with various forms of constraints (a weighted BCE loss, a high-dimensional triplet loss, and a novel mixed instance-aware video loss), to the training of the parent network in step (2), the network is better prepared for step (3), i.e. online fine-tuning on the target instance. Through extensive experiments using different network structures as the backbone, we show that the proposed video loss module improves segmentation performance significantly compared with OSVOS. Moreover, since video loss is a common module, it can be generalized to other fine-tuning-based methods and to similar vision tasks such as depth estimation and saliency detection.
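The abstract names a weighted BCE loss and a triplet loss among the constraints used to build the video loss, but does not give their exact formulations. The sketch below is therefore only an illustration of the generic forms of these two losses, not the authors' definitions. The function names `weighted_bce` and `triplet_loss`, the class-balancing weights, the squared Euclidean distance, and the `margin` default are all assumptions made for this example.

```python
import math


def weighted_bce(probs, labels, eps=1e-7):
    """Class-balanced binary cross-entropy over a flat list of pixels.

    probs:  predicted foreground probabilities in [0, 1]
    labels: ground-truth labels, 1 for foreground and 0 for background
    """
    n = len(labels)
    if n == 0 or n != len(probs):
        raise ValueError("probs and labels must be non-empty and equal in length")
    n_pos = sum(labels)
    n_neg = n - n_pos
    # Assumed weighting: each class is weighted by the frequency of the
    # other class, so a small foreground is not swamped by the background.
    w_pos = n_neg / n
    w_neg = n_pos / n
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        if y == 1:
            total -= w_pos * math.log(p)
        else:
            total -= w_neg * math.log(1.0 - p)
    return total / n


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss on embedding vectors (lists of floats).

    Pulls the anchor towards the positive (same instance) and pushes it
    away from the negative (different instance) by at least `margin`.
    Distances are squared Euclidean, which is an assumption here.
    """
    d_pos = sum((a - p) ** 2 for a, p in zip(anchor, positive))
    d_neg = sum((a - q) ** 2 for a, q in zip(anchor, negative))
    return max(0.0, d_pos - d_neg + margin)


if __name__ == "__main__":
    # One foreground pixel among three background pixels.
    print(weighted_bce([0.9, 0.1, 0.1, 0.1], [1, 0, 0, 0]))
    # Positive is close to the anchor and negative is far, so the loss is zero.
    print(triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0]))
```

In a setting like step (2), such terms would be computed on network outputs and embeddings rather than on plain lists. The list-based form is used here only to keep the example self-contained.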


