A Novel Long-term Iterative Mining Scheme for Video Salient Object Detection

06/20/2022
by   Chenglizhao Chen, et al.
0

The existing state-of-the-art (SOTA) video salient object detection (VSOD) models have widely followed short-term methodology, which dynamically determines the balance between spatial and temporal saliency fusion by solely considering the current consecutive limited frames. However, the short-term methodology has one critical limitation, which conflicts with the real mechanism of our visual system – a typical long-term methodology. As a result, failure cases keep showing up in the results of the current SOTA models, and the short-term methodology becomes the major technical bottleneck. To solve this problem, this paper proposes a novel VSOD approach, which performs VSOD in a complete long-term way. Our approach converts the sequential VSOD, a sequential task, to a data mining problem, i.e., decomposing the input video sequence to object proposals in advance and then mining salient object proposals as much as possible in an easy-to-hard way. Since all object proposals are simultaneously available, the proposed approach is a complete long-term approach, which can alleviate some difficulties rooted in conventional short-term approaches. In addition, we devised an online updating scheme that can grasp the most representative and trustworthy pattern profile of the salient objects, outputting framewise saliency maps with rich details and smoothing both spatially and temporally. The proposed approach outperforms almost all SOTA models on five widely used benchmark datasets.

READ FULL TEXT

page 1

page 3

page 5

page 7

page 8

page 10

page 11

page 13

research
04/01/2020

Spatio-temporal Tubelet Feature Aggregation and Object Linking in Videos

This paper addresses the problem of how to exploit spatio-temporal infor...
research
08/30/2020

Finding Action Tubes with a Sparse-to-Dense Framework

The task of spatial-temporal action detection has attracted increasing a...
research
07/15/2022

Weakly Supervised Video Salient Object Detection via Point Supervision

Video salient object detection models trained on pixel-wise dense annota...
research
08/27/2019

Global-Local Temporal Representations For Video Person Re-Identification

This paper proposes the Global-Local Temporal Representation (GLTR) to e...
research
11/09/2020

TTVOS: Lightweight Video Object Segmentation with Adaptive Template Attention Module and Temporal Consistency Loss

Semi-supervised video object segmentation (semi-VOS) is widely used in m...
research
08/11/2016

Sequence Graph Transform (SGT): A Feature Extraction Function for Sequence Data Mining (Extended Version)

The ubiquitous presence of sequence data across fields such as the web, ...
research
07/24/2021

ASOD60K: Audio-Induced Salient Object Detection in Panoramic Videos

Exploring to what humans pay attention in dynamic panoramic scenes is us...

Please sign up or login with your details

Forgot password? Click here to reset