STFCN: Spatio-Temporal FCN for Semantic Video Segmentation

08/21/2016
by   Mohsen Fayyaz, et al.
0

This paper presents a novel method to involve both spatial and temporal features for semantic video segmentation. Current work on convolutional neural networks(CNNs) has shown that CNNs provide advanced spatial features supporting a very good performance of solutions for both image and video analysis, especially for the semantic segmentation task. We investigate how involving temporal features also has a good effect on segmenting video data. We propose a module based on a long short-term memory (LSTM) architecture of a recurrent neural network for interpreting the temporal characteristics of video frames over time. Our system takes as input frames of a video and produces a correspondingly-sized output; for segmenting the video our method combines the use of three components: First, the regional spatial features of frames are extracted using a CNN; then, using LSTM the temporal features are added; finally, by deconvolving the spatio-temporal features we produce pixel-wise predictions. Our key insight is to build spatio-temporal convolutional networks (spatio-temporal CNNs) that have an end-to-end architecture for semantic video segmentation. We adapted fully some known convolutional network architectures (such as FCN-AlexNet and FCN-VGG16), and dilated convolution into our spatio-temporal CNNs. Our spatio-temporal CNNs achieve state-of-the-art semantic segmentation, as demonstrated for the Camvid and NYUDv2 datasets.

READ FULL TEXT
research
03/23/2021

Enhanced Spatio-Temporal Interaction Learning for Video Deraining: A Faster and Better Framework

Video deraining is an important task in computer vision as the unwanted ...
research
06/24/2015

Parallel Multi-Dimensional LSTM, With Application to Fast Biomedical Volumetric Image Segmentation

Convolutional Neural Networks (CNNs) can be shifted across 2D images or ...
research
10/19/2020

Noisy-LSTM: Improving Temporal Awareness for Video Semantic Segmentation

Semantic video segmentation is a key challenge for various applications....
research
07/29/2017

FCN-rLSTM: Deep Spatio-Temporal Neural Networks for Vehicle Counting in City Cameras

In this paper, we develop deep spatio-temporal neural networks to sequen...
research
09/04/2019

Deep learning networks for selection of persistent scatterer pixels in multi-temporal SAR interferometric processing

In multi-temporal SAR interferometry (MT-InSAR), persistent scatterer (P...
research
12/10/2021

ST-MTL: Spatio-Temporal Multitask Learning Model to Predict Scanpath While Tracking Instruments in Robotic Surgery

Representation learning of the task-oriented attention while tracking in...
research
01/08/2019

Multi-stream CNN based Video Semantic Segmentation for Automated Driving

Majority of semantic segmentation algorithms operate on a single frame e...

Please sign up or login with your details

Forgot password? Click here to reset