Tracking Anything with Decoupled Video Segmentation

09/07/2023
by Ho Kei Cheng, et al.

Training data for video segmentation are expensive to annotate. This impedes extensions of end-to-end algorithms to new video segmentation tasks, especially in large-vocabulary settings. To 'track anything' without training on video data for every individual task, we develop a decoupled video segmentation approach (DEVA), composed of task-specific image-level segmentation and class/task-agnostic bi-directional temporal propagation. Due to this design, we only need an image-level model for the target task (which is cheaper to train) and a universal temporal propagation model which is trained once and generalizes across tasks. To effectively combine these two modules, we use bi-directional propagation for (semi-)online fusion of segmentation hypotheses from different frames to generate a coherent segmentation. We show that this decoupled formulation compares favorably to end-to-end approaches in several data-scarce tasks including large-vocabulary video panoptic segmentation, open-world video segmentation, referring video segmentation, and unsupervised video object segmentation. Code is available at: https://hkchengrex.github.io/Tracking-Anything-with-DEVA
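
The decoupled loop described above can be illustrated with a short sketch. This is a deliberately simplified illustration and not the DEVA implementation: it propagates forward only (no bi-directional or in-clip consensus step), represents masks as sets of pixel coordinates, and associates propagated segments with new image-level detections by greedy IoU matching. The names `segment_image`, `propagate`, `merge`, `track`, `detect_every`, and `thresh` are all hypothetical and chosen for this example.

```python
from typing import Callable, Dict, List, Set, Tuple

Pixel = Tuple[int, int]
Mask = Set[Pixel]          # a mask is a set of (row, col) pixels (assumption)
Tracks = Dict[int, Mask]   # track id -> current mask


def iou(a: Mask, b: Mask) -> float:
    """Intersection over union of two pixel sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def merge(propagated: Tracks, detections: List[Mask],
          next_id: int, thresh: float = 0.5) -> Tuple[Tracks, int]:
    """Greedily fuse propagated tracks with new image-level detections.

    A detection that overlaps an unused track (IoU >= thresh) is merged
    into that track; otherwise it starts a new track. This greedy rule
    is an assumption made for the sketch.
    """
    merged = dict(propagated)
    used = set()
    for det in detections:
        best_id, best_score = None, thresh
        for tid, mask in propagated.items():
            score = iou(mask, det)
            if tid not in used and score >= best_score:
                best_id, best_score = tid, score
        if best_id is None:
            merged[next_id] = set(det)   # unmatched detection: new track
            next_id += 1
        else:
            merged[best_id] = propagated[best_id] | det
            used.add(best_id)
    return merged, next_id


def track(frames: list,
          segment_image: Callable[[object], List[Mask]],
          propagate: Callable[[Tracks, object], Tracks],
          detect_every: int = 2) -> List[Tracks]:
    """Run the task-specific image model sparsely and the task-agnostic
    propagation model on every frame after the first."""
    tracks: Tracks = {}
    next_id = 0
    outputs = []
    for t, frame in enumerate(frames):
        if t > 0:
            tracks = propagate(tracks, frame)        # temporal propagation
        if t % detect_every == 0:
            tracks, next_id = merge(tracks, segment_image(frame), next_id)
        outputs.append(dict(tracks))
    return outputs
```

Only `segment_image` depends on the target task in this sketch; `propagate` and `merge` are shared across tasks, which mirrors the motivation for training the propagation model once.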
