Learning to Associate Every Segment for Video Panoptic Segmentation

06/17/2021
by   Sanghyun Woo, et al.
0

Temporal correspondence - linking pixels or objects across frames - is a fundamental supervisory signal for the video models. For the panoptic understanding of dynamic scenes, we further extend this concept to every segment. Specifically, we aim to learn coarse segment-level matching and fine pixel-level matching together. We implement this idea by designing two novel learning objectives. To validate our proposals, we adopt a deep siamese model and train the model to learn the temporal correspondence on two different levels (i.e., segment and pixel) along with the target task. At inference time, the model processes each frame independently without any extra computation and post-processing. We show that our per-frame inference model can achieve new state-of-the-art results on Cityscapes-VPS and VIPER datasets. Moreover, due to its high efficiency, the model runs in a fraction of time (3x) compared to the previous state-of-the-art approach.

READ FULL TEXT

page 4

page 5

page 8

research
02/26/2020

Efficient Semantic Video Segmentation with Per-frame Inference

In semantic segmentation, most existing real-time deep models trained wi...
research
09/06/2021

The Animation Transformer: Visual Correspondence via Segment Matching

Visual correspondence is a fundamental building block on the way to buil...
research
04/10/2022

Learning Pixel-Level Distinctions for Video Highlight Detection

The goal of video highlight detection is to select the most attractive s...
research
09/14/2017

Learning to Segment Instances in Videos with Spatial Propagation Network

We propose a deep learning-based framework for instance-level object seg...
research
07/11/2019

Object Detection in Video with Spatial-temporal Context Aggregation

Recent cutting-edge feature aggregation paradigms for video object detec...
research
09/14/2021

Adaptive Proposal Generation Network for Temporal Sentence Localization in Videos

We address the problem of temporal sentence localization in videos (TSLV...
research
03/06/2021

Simultaneously Localize, Segment and Rank the Camouflaged Objects

Camouflage is a key defence mechanism across species that is critical to...

Please sign up or login with your details

Forgot password? Click here to reset