Video Semantic Salient Instance Segmentation: Benchmark Dataset and Baseline

07/04/2018
by Trung-Nghia Le, et al.

This paper pushes the envelope on salient regions in a video by decomposing them into semantically meaningful components, namely semantic salient instances. To address this task of video semantic salient instance segmentation, we construct a new dataset, the Semantic Salient Instance Video (SESIV) dataset. SESIV consists of 84 high-quality video sequences with pixel-wise, per-frame ground-truth labels annotated for different segmentation tasks. We also provide a baseline for this problem, called Fork-Join Strategy (FJS). FJS is a two-stream network that leverages the advantages of two different segmentation tasks, i.e., semantic instance segmentation and salient object segmentation. In FJS, we introduce a sequential fusion that combines the outputs of the two streams one instance at a time, so that the resulting instances do not overlap. We also introduce a recurrent instance propagation to refine the shapes and semantic meanings of instances, and an identity tracking to maintain both the identity and the semantic meaning of each instance over the entire video. Experimental results demonstrate the effectiveness of the proposed FJS.
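The abstract does not spell out how the sequential fusion works, so the following is only a minimal sketch of one plausible greedy scheme, not the authors' procedure. It assumes the instance stream yields binary masks with confidence scores and the saliency stream yields a per-pixel map in [0, 1]. The function name `sequential_fusion` and the parameters `sal_thresh` and `min_overlap` are hypothetical and chosen for illustration.

```python
import numpy as np


def sequential_fusion(instance_masks, scores, saliency,
                      sal_thresh=0.5, min_overlap=0.5):
    """Illustrative greedy fusion (an assumption, not the paper's algorithm).

    instance_masks: array of shape (N, H, W), one binary mask per instance.
    scores:         array of shape (N,), confidence per instance.
    saliency:       array of shape (H, W), values in [0, 1].
    Returns a label map (0 = background, k = k-th kept instance) and the
    indices of the kept instances in the order they were added.
    """
    salient = np.asarray(saliency) >= sal_thresh
    label_map = np.zeros(salient.shape, dtype=np.int32)
    kept = []

    # Visit instances one by one, most confident first.
    for idx in np.argsort(scores)[::-1]:
        mask = np.asarray(instance_masks[idx]).astype(bool)
        area = mask.sum()
        if area == 0:
            continue
        # Skip instances that lie mostly outside the salient region.
        if (mask & salient).sum() / area < min_overlap:
            continue
        # Claim only salient pixels not already taken by an earlier instance,
        # which keeps the resulting instances non-overlapping.
        free = mask & salient & (label_map == 0)
        if not free.any():
            continue
        kept.append(int(idx))
        label_map[free] = len(kept)

    return label_map, kept
```

Processing instances in descending score order means that a contested pixel goes to the more confident instance. That is a design choice of this sketch, and other tie-breaking rules would be equally valid under the abstract's description.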


research
03/31/2021

Camouflaged Instance Segmentation: Dataset and Benchmark Suite

This paper pushes the envelope on camouflaged regions to decompose them ...
research
05/12/2019

Video Instance Segmentation

In this paper we present a new computer vision task, named video instanc...
research
09/29/2019

Salient Instance Segmentation via Subitizing and Clustering

The goal of salient region detection is to identify the regions of an im...
research
08/20/2021

BlockCopy: High-Resolution Video Processing with Block-Sparse Feature Propagation and Online Policies

In this paper we propose BlockCopy, a scheme that accelerates pretrained...
research
11/17/2022

TrafficCAM: A Versatile Dataset for Traffic Flow Segmentation

Traffic flow analysis is revolutionising traffic management. Qualifying ...
research
04/20/2018

ConnNet: A Long-Range Relation-Aware Pixel-Connectivity Network for Salient Segmentation

Salient segmentation aims to segment out attention-grabbing regions, a c...
research
10/12/2021

Reliable Shot Identification for Complex Event Detection via Visual-Semantic Embedding

Multimedia event detection is the task of detecting a specific event of ...
