BSUV-Net 2.0: Spatio-Temporal Data Augmentations for Video-Agnostic Supervised Background Subtraction

01/23/2021

Background subtraction (BGS) is a fundamental video processing task and a key component of many applications. Deep learning-based supervised algorithms achieve very promising results in BGS. However, most of these algorithms are optimized for either a specific video or a group of videos, and their performance decreases significantly when they are applied to unseen videos. Recently, several papers have addressed this problem and proposed video-agnostic supervised BGS algorithms. Nearly all of the data augmentations used in these works, however, are limited to the spatial domain and do not account for the temporal variations that naturally occur in video data. In this work, we introduce spatio-temporal data augmentations and apply them to one of the leading video-agnostic BGS algorithms, BSUV-Net. Our new model, trained using the proposed data augmentations and named BSUV-Net 2.0, significantly outperforms state-of-the-art algorithms when evaluated on unseen videos. We also develop a real-time variant of our model, named Fast BSUV-Net 2.0, whose performance is close to the state of the art. Furthermore, we introduce a new cross-validation training and evaluation strategy for the CDNet-2014 dataset that makes it possible to compare the performance of various video-agnostic supervised BGS algorithms fairly and easily. The source code of BSUV-Net 2.0 will be published.
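To make the distinction between a spatial and a spatio-temporal augmentation concrete, the sketch below crops each frame of a clip at a slightly different offset, which loosely imitates camera jitter. This is an illustration only: the abstract does not specify which augmentations BSUV-Net 2.0 uses, and the function name `jittered_crop`, its parameters, and the `(T, H, W, C)` frame layout are all assumptions made for this example.

```python
import numpy as np


def jittered_crop(frames, crop_h, crop_w, max_shift, rng):
    """Crop every frame of a clip with its own random offset.

    Hypothetical helper, not taken from the BSUV-Net 2.0 code.
    frames: array of shape (T, H, W, C).
    Returns an array of shape (T, crop_h, crop_w, C).
    Requires H >= crop_h + 2 * max_shift and W >= crop_w + 2 * max_shift.
    """
    t, h, w, c = frames.shape
    # Base top-left corner, chosen so that every shifted window stays in bounds.
    top = rng.integers(max_shift, h - crop_h - max_shift + 1)
    left = rng.integers(max_shift, w - crop_w - max_shift + 1)
    out = np.empty((t, crop_h, crop_w, c), dtype=frames.dtype)
    for i in range(t):
        # A purely spatial crop would reuse (top, left) for all frames;
        # the per-frame shift is what makes the augmentation temporal.
        dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
        y, x = top + dy, left + dx
        out[i] = frames[i, y:y + crop_h, x:x + crop_w]
    return out


# Usage: a 3-frame clip of 20x20 single-channel frames.
clip = np.arange(3 * 20 * 20).reshape(3, 20, 20, 1)
cropped = jittered_crop(clip, 8, 8, 2, np.random.default_rng(0))
```

Setting `max_shift` to 0 reduces this to an ordinary spatial crop applied identically to every frame. In a supervised BGS setting the same per-frame offsets would also have to be applied to the ground-truth masks.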
