Boosting Video Object Segmentation based on Scale Inconsistency

05/02/2022
by   Hengyi Wang, et al.

We present a refinement framework to boost the performance of pre-trained semi-supervised video object segmentation (VOS) models. Our work is based on scale inconsistency, motivated by the observation that existing VOS models generate inconsistent predictions from input frames of different sizes. We use the scale inconsistency as a cue to devise a pixel-level attention module that aggregates the advantages of the predictions from different-size inputs. The scale inconsistency is also used to regularize training, based on a pixel-level variance measured by an uncertainty estimation. We further present a self-supervised online adaptation, tailored for test-time optimization, that uses the scale inconsistency to bootstrap the predictions without ground-truth masks. Experiments on the DAVIS 16 and DAVIS 17 datasets show that our framework can be applied generically to various VOS models and improves their performance.
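To make the notion of scale inconsistency concrete, the following is a minimal sketch of measuring it as a per-pixel variance across predictions obtained from resized copies of the same frame. It is an illustration under stated assumptions, not the authors' implementation: `toy_model` is a hypothetical stand-in for a pre-trained VOS model, the helper names and the scale set `[0.5, 1.0, 2.0]` are invented for this example, and the learned pixel-level attention module described in the abstract is not reproduced here.

```python
import numpy as np


def resize_nearest(arr, out_h, out_w):
    """Nearest-neighbour resize of a 2-D array to (out_h, out_w)."""
    in_h, in_w = arr.shape
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    return arr[rows[:, None], cols[None, :]]


def toy_model(frame):
    """Hypothetical stand-in for a VOS model (assumption, not a real network).

    A fixed 3x3 box blur followed by a sigmoid. Because the kernel size is
    fixed, the output depends on the input resolution, which mimics the
    scale-dependent behaviour the abstract describes.
    """
    h, w = frame.shape
    padded = np.pad(frame, 1, mode="edge")
    blurred = sum(padded[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0
    return 1.0 / (1.0 + np.exp(-10.0 * (blurred - 0.5)))


def multi_scale_predictions(model, frame, scales):
    """Run the model on resized copies of the frame and map each
    prediction back to the original resolution."""
    h, w = frame.shape
    preds = []
    for s in scales:
        sh, sw = max(1, round(h * s)), max(1, round(w * s))
        scaled = resize_nearest(frame, sh, sw)
        preds.append(resize_nearest(model(scaled), h, w))
    return np.stack(preds)  # shape: (len(scales), h, w)


def scale_inconsistency(preds):
    """Per-pixel variance of the predictions across scales."""
    return preds.var(axis=0)


# Toy frame: a bright square on a dark background.
frame = np.zeros((16, 16))
frame[4:12, 4:12] = 1.0

preds = multi_scale_predictions(toy_model, frame, scales=[0.5, 1.0, 2.0])
inconsistency = scale_inconsistency(preds)
```

In this toy setting the variance is near zero in the flat background and larger around the object boundary, where the resized inputs disagree. A map of this kind is the sort of signal that could weight a fusion of the per-scale predictions or serve as an uncertainty-based regularizer, though how the paper does either is not specified in the abstract.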


research
06/24/2022

The Second Place Solution for The 4th Large-scale Video Object Segmentation Challenge–Track 3: Referring Video Object Segmentation

The referring video object segmentation task (RVOS) aims to segment obje...
research
07/07/2023

Distilling Self-Supervised Vision Transformers for Weakly-Supervised Few-Shot Classification Segmentation

We address the task of weakly-supervised few-shot image classification a...
research
09/18/2020

PMVOS: Pixel-Level Matching-Based Video Object Segmentation

Semi-supervised video object segmentation (VOS) aims to segment arbitrar...
research
09/17/2018

DASNet: Reducing Pixel-level Annotations for Instance and Semantic Segmentation

Pixel-level annotation demands expensive human efforts and limits the pe...
research
02/14/2022

Box Supervised Video Segmentation Proposal Network

Video Object Segmentation (VOS) has been targeted by various fully-super...
research
09/29/2019

RPM-Net: Robust Pixel-Level Matching Networks for Self-Supervised Video Object Segmentation

In this paper, we introduce a self-supervised approach for video object ...
research
07/10/2018

Unsupervised Domain Adaptation for Automatic Estimation of Cardiothoracic Ratio

The cardiothoracic ratio (CTR), a clinical metric of heart size in chest...
