Learning to Detect Violent Videos using Convolutional Long Short-Term Memory

09/19/2017
by   Swathikiran Sudhakaran, et al.
0

Developing a technique for the automatic analysis of surveillance videos in order to identify the presence of violence is of broad interest. In this work, we propose a deep neural network for the purpose of recognizing violent videos. A convolutional neural network is used to extract frame level features from a video. The frame level features are then aggregated using a variant of the long short term memory that uses convolutional gates. The convolutional neural network along with the convolutional long short term memory is capable of capturing localized spatio-temporal features which enables the analysis of local motion taking place in the video. We also propose to use adjacent frame differences as the input to the model thereby forcing it to encode the changes occurring in the video. The performance of the proposed feature extraction pipeline is evaluated on three standard benchmark datasets in terms of recognition accuracy. Comparison of the results obtained with the state of the art techniques revealed the promising capability of the proposed method in recognizing violent videos.

READ FULL TEXT
research
09/19/2017

Convolutional Long Short-Term Memory Networks for Recognizing First Person Interactions

In this paper, we present a novel deep learning based approach for addre...
research
11/16/2018

Relational Long Short-Term Memory for Video Action Recognition

Spatial and temporal relationships, both short-range and long-range, bet...
research
04/15/2022

Detecting Violence in Video Based on Deep Features Fusion Technique

With the rapid growth of surveillance cameras in many public places to m...
research
02/19/2020

SummaryNet: A Multi-Stage Deep Learning Model for Automatic Video Summarisation

Video summarisation can be posed as the task of extracting important par...
research
02/20/2019

Dynamic Matrix Decomposition for Action Recognition

Designing a technique for the automatic analysis of different actions in...
research
05/31/2018

Lip Reading Using Convolutional Auto Encoders as Feature Extractor

Visual recognition of speech using the lip movement is called Lip-readin...

Please sign up or login with your details

Forgot password? Click here to reset