TubeFormer-DeepLab: Video Mask Transformer

05/30/2022
by Dahun Kim, et al.

We present TubeFormer-DeepLab, the first attempt to tackle multiple core video segmentation tasks in a unified manner. Different video segmentation tasks (e.g., video semantic/instance/panoptic segmentation) are usually considered distinct problems. The state-of-the-art models adopted in the separate communities have diverged, and radically different approaches dominate each task. By contrast, we make the crucial observation that video segmentation tasks can generally be formulated as the problem of assigning different predicted labels to video tubes (where a tube is obtained by linking segmentation masks along the time axis), and that the labels may encode different values depending on the target task. This observation motivates us to develop TubeFormer-DeepLab, a simple and effective video mask transformer model that is widely applicable to multiple video segmentation tasks. TubeFormer-DeepLab directly predicts video tubes with task-specific labels (either pure semantic categories, or both semantic categories and instance identities), which not only significantly simplifies video segmentation models but also advances state-of-the-art results on multiple video segmentation benchmarks.
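The tube formulation in the abstract can be illustrated with a small data-structure sketch. It is not the TubeFormer-DeepLab model, and it makes no claim about how the paper represents its predictions; it only shows, under assumed names (`Tube`, `tube_label`, `render` are all hypothetical), how one tube type can carry either a pure semantic category or a semantic category plus an instance identity.

```python
from dataclasses import dataclass
from typing import List, Optional

Mask = List[List[int]]  # one frame: H x W grid of 0/1 values


@dataclass
class Tube:
    """Hypothetical container: one binary mask per frame, linked along time."""
    masks: List[Mask]            # length T, one mask per frame
    semantic_class: str          # e.g. "road", "car"
    instance_id: Optional[int]   # None when the task is purely semantic


def tube_label(tube: Tube):
    """Return the label a tube carries; its form depends on the target task."""
    if tube.instance_id is None:
        return tube.semantic_class                    # semantic category only
    return (tube.semantic_class, tube.instance_id)    # category + instance identity


def render(tubes: List[Tube], num_frames: int, height: int, width: int):
    """Paint each tube's label onto a T x H x W grid (None means unlabeled).

    Later tubes overwrite earlier ones where masks overlap; this is an
    assumption of the sketch, not something stated in the abstract.
    """
    video = [[[None] * width for _ in range(height)] for _ in range(num_frames)]
    for tube in tubes:
        label = tube_label(tube)
        for t, mask in enumerate(tube.masks):
            for y, row in enumerate(mask):
                for x, on in enumerate(row):
                    if on:
                        video[t][y][x] = label
    return video


# Toy 2-frame, 1 x 3 video: a "road" region and one "car" that moves right.
road = Tube(masks=[[[1, 0, 0]], [[1, 1, 0]]], semantic_class="road", instance_id=None)
car = Tube(masks=[[[0, 1, 0]], [[0, 0, 1]]], semantic_class="car", instance_id=1)
labels = render([road, car], num_frames=2, height=1, width=3)
```

In this toy example the car keeps the same instance identity across both frames because it lives in a single tube, which is the property that lets one representation cover the semantic, instance, and panoptic settings.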


research
07/13/2021

Per-Pixel Classification is Not All You Need for Semantic Segmentation

Modern approaches typically formulate semantic segmentation as a per-pix...
research
01/06/2023

TarViS: A Unified Approach for Target-based Video Segmentation

The general domain of video segmentation is currently fragmented into di...
research
04/10/2023

Video-kMaX: A Simple Unified Approach for Online and Near-Online Video Panoptic Segmentation

Video Panoptic Segmentation (VPS) aims to achieve comprehensive pixel-le...
research
10/12/2022

A Generalist Framework for Panoptic Segmentation of Images and Videos

Panoptic segmentation assigns semantic and instance ID labels to every p...
research
06/08/2021

Hierarchical Lovász Embeddings for Proposal-free Panoptic Segmentation

Panoptic segmentation brings together two separate tasks: instance and s...
research
12/16/2016

Video Propagation Networks

We propose a technique that propagates information forward through video...
research
05/03/2023

CLUSTSEG: Clustering for Universal Segmentation

We present CLUSTSEG, a general, transformer-based framework that tackles...
