Semantic Segmentation of Video Sequences with Convolutional LSTMs

05/03/2019
by   Andreas Pfeuffer, et al.
0

Most of the semantic segmentation approaches have been developed for single image segmentation, and hence, video sequences are currently segmented by processing each frame of the video sequence separately. The disadvantage of this is that temporal image information is not considered, which improves the performance of the segmentation approach. One possibility to include temporal information is to use recurrent neural networks. However, there are only a few approaches using recurrent networks for video segmentation so far. These approaches extend the encoder-decoder network architecture of well-known segmentation approaches and place convolutional LSTM layers between encoder and decoder. However, in this paper it is shown that this position is not optimal, and that other positions in the network exhibit better performance. Nowadays, state-of-the-art segmentation approaches rarely use the classical encoder-decoder structure, but use multi-branch architectures. These architectures are more complex, and hence, it is more difficult to place the recurrent units at a proper position. In this work, the multi-branch architectures are extended by convolutional LSTM layers at different positions and evaluated on two different datasets in order to find the best one. It turned out that the proposed approach outperforms the pure CNN-based approach for up to 1.6 percent.

READ FULL TEXT

page 1

page 6

research
07/16/2019

Separable Convolutional LSTMs for Faster Video Segmentation

Semantic Segmentation is an important module for autonomous robots such ...
research
01/08/2019

Multi-stream CNN based Video Semantic Segmentation for Automated Driving

Majority of semantic segmentation algorithms operate on a single frame e...
research
11/29/2020

UVid-Net: Enhanced Semantic Segmentation of UAV Aerial Videos by Embedding Temporal Information

Semantic segmentation of aerial videos has been extensively used for dec...
research
06/30/2023

Obscured Wildfire Flame Detection By Temporal Analysis of Smoke Patterns Captured by Unmanned Aerial Systems

This research paper addresses the challenge of detecting obscured wildfi...
research
07/18/2023

PottsMGNet: A Mathematical Explanation of Encoder-Decoder Based Neural Networks

For problems in image processing and many other fields, a large class of...
research
05/22/2017

TricorNet: A Hybrid Temporal Convolutional and Recurrent Network for Video Action Segmentation

Action segmentation as a milestone towards building automatic systems to...
research
10/12/2018

Improving Generalization of Sequence Encoder-Decoder Networks for Inverse Imaging of Cardiac Transmembrane Potential

Deep learning models have shown state-of-the-art performance in many inv...

Please sign up or login with your details

Forgot password? Click here to reset