A Foreground Inference Network for Video Surveillance Using Multi-View Receptive Field

by Thangarajah Akilan, et al.

Foreground (FG) pixel labelling plays a vital role in video surveillance. Recent engineering solutions have attempted to exploit the efficacy of deep learning (DL) models, initially targeted at image classification, for FG pixel labelling. One major drawback of this strategy is the poor delineation of visual objects when training samples are limited. To address this issue, we introduce a multi-view receptive field fully convolutional neural network (MV-FCN) that harnesses recent seminal ideas such as fully convolutional structure, inception modules, and residual networking. Building on these ideas, we implement a system in an encoder-decoder fashion that comprises a core and two complementary feature flow paths. The model exploits inception modules at early and late stages with three different sizes of receptive fields to capture invariance at various scales. The features learned in the encoding phase are fused with the appropriate feature maps in the decoding phase through residual connections to achieve an enhanced spatial representation. These multi-view receptive fields and residual feature connections are expected to yield highly generalized features for accurate pixel-wise FG region identification. The model is then trained with database-specific exemplary segmentations to predict the desired FG objects. Comparative experimental results on eleven benchmark datasets validate that the proposed model achieves very competitive performance against prior and state-of-the-art algorithms. We also report how well a transfer learning approach can enhance the performance of the proposed MV-FCN.
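
To make the idea of multi-view receptive fields and residual fusion concrete, the following is a minimal sketch in plain Python and not the authors' implementation. It rests on several assumptions: the kernel sizes 1, 3 and 5 are illustrative (the abstract states only that three sizes are used), fixed averaging kernels stand in for learned weights, and residual fusion is modelled as element-wise addition. The function names conv2d_same, box_kernel, multi_view_features and residual_fuse are hypothetical.

```python
from typing import List, Sequence

Grid = List[List[float]]


def conv2d_same(image: Grid, kernel: Grid) -> Grid:
    """Cross-correlate image with an odd-sized square kernel.

    Zero padding keeps the output the same size as the input.
    """
    h, w = len(image), len(image[0])
    k = len(kernel)
    r = k // 2  # kernel radius
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in range(k):
                for dx in range(k):
                    yy, xx = y + dy - r, x + dx - r
                    # Pixels outside the image count as zero.
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += image[yy][xx] * kernel[dy][dx]
            out[y][x] = acc
    return out


def box_kernel(size: int) -> Grid:
    """Averaging kernel; a placeholder for weights a network would learn."""
    v = 1.0 / (size * size)
    return [[v] * size for _ in range(size)]


def multi_view_features(image: Grid, sizes: Sequence[int] = (1, 3, 5)) -> List[Grid]:
    """Filter the same input at several receptive-field sizes.

    Returns one feature map per size (assumed sizes, see lead-in).
    """
    return [conv2d_same(image, box_kernel(s)) for s in sizes]


def residual_fuse(decoder_map: Grid, encoder_map: Grid) -> Grid:
    """Fuse two same-sized maps by element-wise addition (assumed fusion rule)."""
    return [[d + e for d, e in zip(d_row, e_row)]
            for d_row, e_row in zip(decoder_map, encoder_map)]


# A 5x5 frame with a single bright pixel at the centre.
image = [[0.0] * 5 for _ in range(5)]
image[2][2] = 1.0

views = multi_view_features(image)

# Treat the coarse view as a decoder map and the fine view as an encoder map.
fused = residual_fuse(views[2], views[0])
```

In this toy frame, the 3x3 view at the top-left corner does not reach the centre pixel, while the 5x5 view does, which illustrates why combining several receptive-field sizes can capture objects at different scales. Adding the fine view back onto the coarse one restores the sharp response at the object's exact location.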



