Anomaly Detection in Road Traffic Using Visual Surveillance: A Survey

01/24/2019 ∙ by Santhosh Kelathodi Kumaran, et al. ∙ Indian Institute of Technology Bhubaneswar ∙ IIT Roorkee

Computer vision has evolved in the last decade as a key technology for numerous applications replacing human supervision. In this paper, we present a survey of visual surveillance research relevant to anomaly detection in public places, focusing primarily on roads. First, we revisit the surveys published in this field over the last 10 years. Since learning is the underlying building block of a typical anomaly detection system, we place more emphasis on learning methods applied to video scenes. We then summarize the important contributions made during the last six years on anomaly detection, focusing primarily on features, underlying techniques, applied scenarios and types of anomalies using a single static camera. Finally, we discuss the challenges in computer vision-based anomaly detection techniques and some important future possibilities.




I Introduction

With the widespread use of surveillance cameras in public places, computer vision-based scene understanding has gained a lot of popularity amongst the CV research community. Visual data contains rich information compared to other information sources such as GPS, mobile location, radar signals, etc. Thus, it can play a vital role in detecting or predicting congestion, accidents and other anomalies, apart from collecting statistical information about the status of road traffic.

Fig. 1: Overview of a typical anomaly detection scheme. Preprocessing block extracts features/data in the form of descriptors. The normal behavior is represented in abstract form in terms of rules, models, or data repository. Specific anomaly detection techniques are used for detecting anomalies using anomaly scoring or labeling mechanism.
Fig. 2: Visual snapshots of some state-of-the-art anomaly detection techniques, giving an overview of the survey. (a) Accident detection using Motion Interaction Field (MIF) [211]. (b) Anomaly detection using topic-based models [138]. Top row shows a vehicle that crossed the stop line, middle row represents a jaywalking scenario and the bottom row represents a vehicle taking an unusual turn. (c) Real world anomaly detection using multiple instance learning (MIL) [168]. The anomaly is measured using an anomaly score in an explosion scene. (d) Presence of a vehicle on a walkway detected using spatio-temporal adversarial networks (STAN) [92]. The top row represents the anomaly visualization from the generator and the bottom row represents the anomaly visualization from the discriminator.

Several computer vision-based studies have been conducted focusing on data acquisition [175], feature extraction [80, 164], scene learning [124, 14, 67, 36], activity learning [181], behavioral understanding [162, 15], etc. These studies primarily discuss aspects such as scene analysis, video processing techniques, anomaly detection methods, vehicle detection and tracking, multi-camera-based techniques and challenges, activity recognition, traffic monitoring, human behavior analysis, emergency management, event detection, etc.

Anomaly detection is a sub-domain of behavior understanding [175] from surveillance scenes. Anomalies are typically aberrations of scene entities (vehicles, humans or the environment) from the normal behavior. With the availability of video feeds from public places, there has been a surge in research output on video analysis and anomaly detection [162, 164, 158, 115]. Typically, anomaly detection methods learn the normal behavior via training. Anything deviating significantly from the normal behavior can be termed anomalous. Vehicle presence on walkways, a sudden dispersal of people within a gathering, a person falling suddenly while walking, jaywalking, signal bypassing at a traffic junction, or U-turns of vehicles during red signals are a few examples of anomalies. Anomaly detection frameworks typically use supervised, semi-supervised or unsupervised learning. In this survey, we mainly explore anomaly detection techniques used in road traffic scenarios, focusing on entities such as vehicles, pedestrians, the environment and their interactions.

We have noted that the scope of the study should cover the nature of input data and their representations, feasibility of supervised learning, types of anomalies, suitability of the techniques in application contexts, anomaly detection outputs and evaluation criteria. We present this survey from the above perspectives. A typical anomaly detection framework is presented in Fig. 1. Usually, anomaly detection systems work by learning the normal data patterns to build a normal profile. Once the normal patterns are learned, anomalies can be detected with the help of established approaches [137, 97]. The output of the system can be a score, typically in the form of a metric, or a label that indicates whether the data is anomalous or not.
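The learn-score-label pipeline described above can be illustrated with a minimal sketch. It assumes a single scalar feature and a mean/standard-deviation normal profile; the function names, the threshold of 3 and the speed values are hypothetical choices for illustration and are not taken from any surveyed method.

```python
import math

def learn_normal_profile(samples):
    """Summarize normal 1-D feature values by their mean and standard deviation."""
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((x - mean) ** 2 for x in samples) / n
    return mean, math.sqrt(variance)

def anomaly_score(x, profile):
    """Score: distance from the normal mean in units of standard deviation."""
    mean, std = profile
    if std == 0:
        return 0.0 if x == mean else float("inf")
    return abs(x - mean) / std

def anomaly_label(x, profile, threshold=3.0):
    """Label: True when the score exceeds the (assumed) threshold."""
    return anomaly_score(x, profile) > threshold

# Hypothetical vehicle speeds (km/h) observed under normal conditions.
normal_speeds = [48.0, 50.0, 52.0, 49.0, 51.0]
profile = learn_normal_profile(normal_speeds)

print(anomaly_label(51.0, profile))  # close to the normal profile
print(anomaly_label(5.0, profile))   # far from the normal profile
```

The score is the continuous output and the label is its thresholded form, matching the two output types mentioned in the text.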

Some examples of anomaly detection results are shown in Fig. 2.

I-A Recent Surveys

Over the last 10 years or so, a few interesting surveys have been published in this field of research. The authors of [124] have explored object detection, tracking, scene modeling and activity analysis using video trajectories. The study presented in [176] covers vehicle detection, tracking, behavior understanding and incident detection from the purview of intelligent transportation systems (ITS). The authors of [26] have conducted an in-depth study of traffic analysis frameworks under different taxonomies, with pointers on integrating information from multiple sensors. The review presented in [164] is possibly the first work covering anomaly detection techniques. It covers sensors, entities, feature extraction methods, learning methods and scene modeling to detect anomalies. In [162], an object-oriented approach from the perspective of vehicle-mounted sensors for object detection, tracking and behavior analysis has been presented, detailing the progress of the last decade. The multi-camera study presented in [194] covers research related to surveillance in multi-camera setups. The authors of [171] discuss events that are considered a subset of anomalous events, requiring immediate attention and occurring unintentionally, abruptly and unexpectedly. The research presented in [144] discusses safety, security and law enforcement related applications from the computer vision perspective. The review presented in [181] discusses the elements of human activity and behavioral understanding frameworks. The authors of [25] present research on human behavioral understanding through actions and interactions of human entities. Intelligent video systems covering analytics aspects have been studied in [105]. Surveillance systems with specific application areas have been presented in [213]. The authors of [175] systematically divide road traffic analysis into four layers, namely image acquisition, dynamic and static attribute extraction, behavioral understanding and ITS services. Datasets used for anomaly detection have been covered in [140]. Traffic monitoring using different types of sensors has been discussed in [41]. Algorithms used for spatio-temporal point detection and their applications in the vision domain have been covered in [101]. Traffic entities have been studied from the perspective of safety in [158]. The authors of [8] explore studies on video trajectory-based analysis and applications. The authors of [110] discuss various ways of handling emergency situations by assessing the risks, preparedness, response, recovery and mitigation using information extracted from visual features with the help of various learning mechanisms. In [115], the authors have presented anomalous human behavior recognition work with a focus on behavior representation and modeling, feature extraction techniques, classification and behavior modeling frameworks, performance evaluation techniques, and datasets with examples of video surveillance systems. Table I summarizes the major computer vision-based studies done during the last 10 years. In our survey, we particularly focus on studies of anomaly detection that are relevant to road traffic scenarios.

Anomalies are contextual in nature. The assumptions used in anomaly detection cannot be applied universally across different traffic scenarios. We analyze the capabilities of anomaly detection methods used in road traffic surveillance from the perspective of data. In the process, we categorize the methods according to scene representation, employed features, models used and approaches.

Ref. Focus Explored research areas
Morris (2008) [124] Video trajectory-based scene analysis Scene modeling: Tracking, interest point study, activity path learning; Applications: People movement, traffic, parking lot, and entity interaction; Path learning: Preprocessing (normalization and dimensionality reduction), clustering approaches and used distance measures, path modeling, relevance of path feedback in low level systems; Activity analysis: Virtual fencing, speed profiling, path classification, abnormality detection, online activity analysis, object interaction characterization.
Tian (2011) [176] Video processing techniques applied for traffic monitoring Traffic parameters collection; Traffic incident detection; Vehicle detection scenarios: Background modeling and non-background modeling approaches, shadow detection and removal; Vehicle tracking, model-based classification, region, deformable template and feature study, tracking algorithms; Traffic incident detection and behavior understanding.
Buch (2011) [26] Video analytics system for urban traffic Applications: Vehicle counting, automatic number plate recognition, incident detection; Analytics system components; Foreground segmentation techniques: Frame differencing, background subtraction (averaging, single Gaussian, mode estimation, Kalman filter, wavelets), GMM, graph cuts, shadow removal, object-based segmentation; Top-down vehicle classification: Features (region based, contour based), machine learning techniques; Bottom-up approaches: Interest point descriptors, object classification; Tracking: Kalman filter, PF, S-T MRF, graph correspondence, event cones; Traffic analytic system: Urban (camera domain, three dimensional modeling), highways (detection and classification).
Sodemann (2012) [164] Anomaly detection Study on sensors: Visible-spectrum camera (low-level feature extraction and object level feature extraction), audio and infrared sensors; Learning methods: Unsupervised, supervised and apriori modeling; Classification algorithms: Dynamic Bayesian networks, Bayesian topic models, artificial neural networks, clustering, decision trees, fuzzy reasoning.
Sivaraman (2013) [162] Vision-based vehicle detection, tracking and behavior analysis Sensors: radar, lidar, camera; Vehicle detection: Monocular vision (camera placement, appearance features and classification, motion based approaches, vehicle pose). Stereo vision (matching, motion-based approaches); Vehicle tracking: Monocular and stereo tracking, vision cue fusion, real-time challenges and system architecture, fusion with other modalities; Behavior analysis: context, vehicle maneuvers, trajectories, behavioral classification; Future direction of vehicle detection, tracking, their on-road behavior and public benchmarks.
Wang (2013) [194] Multi-camera based surveillance Multi-camera calibration; Topology computation; Multi-camera object tracking: Calibration, appearance cues, correspondence-based methods; Object re-identification: Feature studies, learning methods; Multi-camera activity analysis: Correspondence free methods, activity models, human action recognition; Cooperative video surveillance using active and static cameras; Background modeling and object tracking with active cameras.
Suriani (2013) [171] Abrupt event detection Human centered, vehicle centered and small area centered studies; Methods of detection: Single person, multiple person, vehicles, multi-view camera based.
Loce (2013) [144] Traffic management Vehicle mounted camera-based safety applications: Lane departure warning and lane change assistance, pedestrian detection, driver monitoring, adaptive warning systems; Efficiency studies: Traffic flow management, incident management, video based tolling; Security management: Alert and warning systems, traffic surveillance, recognizing and tracking vehicles of interest; Law enforcement: Studies on speed enforcement, violation detection at road intersections, vehicle mounted mobile camera based vehicle identification.
Vishwakarma (2013) [181] Human activity recognition and behavior analysis Application areas: Behavioral biometrics, content-based video analysis, security and surveillance, interactive applications, animation and synthesis; Object detection methods: Motion segmentation methods (background subtraction based, statistical, temporal differencing and optical flow-based) and object classification; Object tracking methods (region, contour, feature, model, hybrid and optical flow-based); Action recognition techniques: Hierarchical (statistical, syntactic and description based) and non-hierarchical approaches; Human behavior understanding: Supervised, semi-supervised and unsupervised models; Dataset description: Controlled and realistic environments and its realistic impact on video-based surveillance market.
Borges (2013) [25] Human behavior analysis Human detection methods: Appearance, motion and hybrid approaches; Action recognition approaches: Low-level and spatio-temporal interest points, mid and high-level, silhouettes features; Interaction recognition: One-to-one, group interactions, models; Datasets.
Liu (2013) [105] Intelligent video systems and analytics Video systems: Architecture (distributed/centralized), quality diagnosis, system adaptability (configuration, calibration, capability and scalability) analysis, data management and transmission methods; Analytics: Object attributes, motion pattern recognition, event and behavior analysis; Analytic methods: Intelligence and cooperative aspects, multi-camera view selections, statistical and networked analysis, learning and classification, 3-D sensing; Applications areas: Management, traffic control, transportation, intelligent vehicles, health-care, life sciences, security and military.
Zablocki (2013) [213] Characteristics of intelligent video surveillance systems System classification: Object detection, tracking and movement analysis technologies; Anomaly detection, identification and warning/alarming systems; Vehicle detection, traffic and parking lot analysis systems; Object counting systems; Integrated camera view handling systems; Privacy preserving systems; Cloud-based systems.
Tian (2015) [175] Vehicle surveillance Dynamic and static attribute extraction: Appearance and motion-based detection, tracking, recognition (license plate, type, color and logo), networked tracking of vehicles; Behavior understanding: Single camera study, trajectory (clustering, modeling and retrieval) and networked multi-camera-based, interesting region discovery; Image acquisition: Traffic scene characteristics, imaging technologies; ITS service study: Illegal activity and anomaly detection, security monitoring, electronic toll collection, traffic flow analysis, transportation planning and road construction, environment impact assessment.
Patil (2016) [140] Video datasets for anomaly detection Dataset classification: Traffic, subway, panic driven, pedestrian, abnormal activity, campus, train, sea, crowd.
Datondji (2016) [41] Traffic monitoring at intersections Camera based classification: Mono vision, omni vision and stereo vision; Vehicle sensing: Methodologies and datasets; Challenges: Initialization and preprocessing, vehicle detection and tracking; Vehicle detection methods: Candidate localization, verification; Vehicle tracking: Representation and tracking approaches: Region, contour, feature and model-based; Vehicle tracking algorithms: Matching, Bayesian; Challenges for intersection; Monitoring systems: Monocular vision and omni-directional vision-based, in-vehicle monitoring; Vehicle tracking: Roadside monitoring systems, in-vehicle monitoring systems; Vehicle behavior analysis.
Li (2017) [101] Spatio-temporal interest point (STIP) detection algorithms STIPs algorithms; Detection challenges; Applications: Human activity detection, anomaly detection, video summarization and content based video retrieval.
Shirazi (2017) [158] Intersections analysis from safety perspective Vehicular behavior: Trajectories, vehicle speed, acceleration, turn recognition; Driver behavior: Turning intention, aggression, perception reaction time; Pedestrian behavior: Motion prediction, waiting time, walking speed, crossing speed, and choices; Safety assessment: Gap analysis, threat, risk, conflict, accident; Intersection safety systems: Driver assistance systems (driver perception enhancement, action suggestion and human driver interface, advanced vehicle motion control delegation), infrastructure-based systems (roadside warning systems, dilemma zone protection systems, decision support systems).
Ahmed (2018) [8] Trajectory-based analysis Trajectory analysis: Datasets, extraction, representation, applications; Clustering algorithms; Event detection: Methods and learning procedures; Localization of abnormal events: Methods and learning procedures; Video summarization and synopsis generation.
Lopez-Fuentes (2018) [110] Emergency management using computer vision Emergency classification: Natural, human made (road accident, crowd related, weapon threat, drowning, injured person, falling person); Monitoring objective: Prevention, detection, response and understanding; Acquisition methods: Sensor location, sensor types, acquisition rate and sensor cost; Feature extraction algorithms: Color, shape and texture, temporal (wavelet, optical flow, background modeling and subtraction, tracking) and convolution features; Semantic information extraction using machine learning: Artificial neural networks, deep learning, support vector machines (SVMs), hidden Markov models (HMMs), fuzzy logic.
Mabrouk (2018) [115] Abnormal behavior recognition Behavior representation; Anomalous behavior recognition methods: Modeling frameworks and classification methods, scene density and moving object interaction in crowded and uncrowded scenes; Performance evaluation: Datasets and metrics; Existing surveillance systems.
TABLE I: Surveys on computer vision-based methods in surveillance

The rest of the paper is organized as follows. First, the background and the terminologies used in the paper are introduced in Section II-A. Anomaly detection related visual scene learning methods are presented in Section II-B. Anomaly detection approaches and their classification are elaborated in Section II-C. Features used for anomaly detection and application areas are presented in Sections II-D and II-E, respectively. A critical analysis of the existing methods, followed by discussions on the challenges and future possibilities of anomaly detection, is presented in Section III. We conclude the paper in Section IV.

II Computer Vision Guided Anomaly Detection Studies

II-A Background and Terminologies

Features are assumed as data in the present context and are represented in the form of feature descriptors. Data typically occupy a position in a multi-dimensional space depending on the feature descriptor length.

Anomalies are data patterns that do not conform to a well-defined notion of normal behavior [29]. Anomalies are also referred to as outliers or novelties in various application areas [58]. In the rest of this paper, we use the terms anomaly and outlier.

II-A1 Anomaly Classification

Traditionally, anomalies are classified as point anomalies [152, 96, 73], contextual anomalies [165, 210] and collective anomalies [192, 34]. Data correspond to a point anomaly if they lie far away from the usual distribution. For example, a non-moving car on a busy road can be termed a point anomaly. Contextual anomalies correspond to data that may be termed normal in a different context. For example, in slow-moving traffic, a biker riding faster than the others may be termed an anomaly, whereas on a less dense road the same behavior may be normal. In a collective anomaly, a group of data instances together causes an anomaly even though each instance may be normal individually. For example, a group of people dispersing within a short span of time can be termed a collective anomaly.
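The contextual case can be made concrete with a small sketch in which the same speed is judged against two different contexts, following the biker example above. The relative-deviation score, the threshold of 1.0 and the speed values are assumptions made for illustration only; the sketch also assumes a positive mean context speed.

```python
def contextual_score(speed, context_speeds):
    """Relative deviation of one speed from the mean speed of its context."""
    context_mean = sum(context_speeds) / len(context_speeds)
    return abs(speed - context_mean) / context_mean

def is_contextual_anomaly(speed, context_speeds, threshold=1.0):
    """Flag the speed when it deviates from its context by more than the threshold."""
    return contextual_score(speed, context_speeds) > threshold

# Hypothetical speeds (km/h) of surrounding road users in two contexts.
slow_traffic = [10.0, 12.0, 8.0, 10.0]
free_road = [40.0, 45.0, 35.0, 40.0]

biker_speed = 40.0
print(is_contextual_anomaly(biker_speed, slow_traffic))  # anomalous in slow traffic
print(is_contextual_anomaly(biker_speed, free_road))     # normal on a less dense road
```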

In the context of visual surveillance, it is common to see anomalies classified as local and global anomalies [57, 68, 139, 207, 138, 154]. Global anomalies can be present in a frame or a segment of the video without specifying where exactly they have happened [57, 68, 139]. Local anomalies usually happen within a specific area of the scene, but may be missed by global anomaly detection algorithms [207, 138, 154]. Some methods can detect both global and local anomalies [190, 5, 34, 78, 222].

II-A2 Challenges and Scope of Study

The key challenges in anomaly detection are: (i) defining a representative normal region, (ii) boundaries between the normal and anomalous regions may not be crisp or well defined, (iii) the notion of anomaly is not same in all application contexts, (iv) limited availability of data for training and validation, (v) data is often noisy due to inaccurate sensing, and (vi) normal behavior evolves over time.

This survey is based on studies conducted on videos captured through a single static camera. Anomaly detection using multiple cameras involves additional challenges, and the frameworks can be completely different [12, 57].

II-B Learning Methods

Learning the normal behavior is not only relevant for anomaly detection, but also for diverse use cases. Pattern analysis [47], classification [129], prediction [125], density estimation [4], and behavior analysis [15] are a few amongst them.

Learning methods can be classified as supervised, unsupervised or semi-supervised. In supervised learning, the normal profile is built using labeled data [79, 74, 81, 159]. It is typically applied to classification and regression related applications. In unsupervised learning, the normal profile is structured from the relationships between elements of the unlabeled dataset [166]. Semi-supervised learning primarily uses unlabeled data, with supervision from a small amount of labeled data specifying example classes known a priori [170, 106]. If learning happens through interactive labeling of data as and when the label information is available, it is called active learning [179, 42, 109, 134]. Such methods are used when unlabeled data are abundant and manual labeling is expensive. Reinforcement learning, a relatively new learning paradigm in computer vision, is an area of machine learning concerned with how software agents (discriminant and generator) ought to take actions in an environment so as to maximize some notion of cumulative reward [195, 191, 215]. Some of the important works are summarized in Table II.

Learning method Ref.
Supervised [163, 143, 161, 111, 37, 113, 59, 62, 79, 74, 81, 159, 220, 32, 92]
Unsupervised [50, 131, 2, 149, 72, 117, 107, 152, 166, 182, 203, 199]
Semi-supervised [185, 27, 114, 170, 106, 138]
TABLE II: Broad categorization based on learning methods

Learned models have been used not only in feature extraction, but also in object detection [188], classification [82], activity recognition [130], segmentation [86], tracking [183], entity re-identification [102], object interaction analysis [209], anomaly detection [77], etc. Table III presents some important learning methods used in anomaly detection.

Learning Method Method Applied context
Supervised Hidden Markov Model (HMM) [17] A supervised statistical Markov model where the system modeled is assumed to be a Markov process with hidden states: Used for anomaly detection in [20, 189].
Support Vector Machine (SVM) [61] A representation of data points in space, mapped such that separate categories are divided by a clear separation between them: A special class of SVM, namely one-class SVM (OCSVM), has been used extensively for anomaly detection [157].
Gaussian Regression (GR) [147] A generic supervised learning method designed to solve regression and probabilistic classification problems: Used in [34, 153] for anomaly detection from videos.
Convolutional Neural Networks (CNN) [54] A class of deep neural networks, applied usually to analyze visual imagery: Due to its applicability in extracting semantic level features from the input, it has become popular in many applications including anomaly detection [68, 118].
Multiple Instance Learning (MIL) [13] A special learning framework which deals with uncertainty of instance labels: Instead of receiving a set of instances which are individually labeled, the learner receives a set of labeled bags, each containing many instances. If all the instances in it are negative, the bag may be labeled negative. If there is at least one positive instance, the bag is labeled positive. It has been used for anomaly detection in [207, 168].
Long short-term memory (LSTM) networks [63] A special kind of recurrent neural network (RNN) used in time series applications: In [113, 112, 118, 166], it has been used for anomaly detection.
Fast Region-based-CNN (Fast R-CNN) [53] A variant of deep neural networks (DNN) that works efficiently in object classification over conventional CNNs: Used for anomaly detection in [62].
Unsupervised Latent Dirichlet Allocation (LDA) [23] A topic model using statistical analysis to retrieve the underlying topic distribution of documents: Used for modeling visual words of videos for anomaly detection [73].
Probabilistic Latent Semantic Analysis (pLSA) [64] A model for representing co-occurrence information under a probabilistic framework: Used in [84] for anomaly detection.
Hierarchical Dirichlet Process (HDP) [174] A nonparametric Bayesian approach, built based on LDA, to cluster data: Used in data modeling and anomaly detection [78].
Gaussian Mixture Model (GMM) [19] A probabilistic model that assumes all the data points are generated from a mixture of a finite number of Gaussian distributions with unknown parameters: Used for anomaly detection in [99, 200].
Density-based spatial clustering of applications with noise (DBSCAN) [48] A density based non-parametric clustering algorithm used extensively for modeling and learning data patterns: Used for anomaly detection in [145].
Fisher kernel method [142] A function to measure similarity of two objects on the basis of sets of measurements for each object and a statistical model: Used to obtain trajectory feature representation in [186].
Principal component analysis (PCA) [75] A statistical procedure of orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables: Used for dimensionality reduction in [187].
Particle Swarm Optimization [85] A population based stochastic optimization technique: Used in [77] to obtain optimized motion descriptor from a set of particles having individual motion characteristics.
Generative Adversarial networks (GAN) [55] A class of artificial intelligence algorithms used in unsupervised machine learning, implemented by a system of two neural networks (generator and discriminator) contesting with each other in a zero-sum game framework: Used for anomaly detection in
Hybrid HDP+HMM A hybrid model: Used for representing sub-trajectories in [207] for anomaly detection using MIL.
GAN-LSTM [92] A hybrid model: Fake frames required for adversarial learning used in [119] are generated using bidirectional Conv-LSTM [204].
CNN-LSTM [119] A hybrid model: Prediction-based anomaly detection with the help of CNN-LSTM.
TABLE III: Examples of learning methods used in anomaly detection
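The bag-labeling rule given for MIL in Table III can be sketched as follows. The first function follows the rule as stated in the table; taking the maximum instance score as the bag score is an assumption added here for illustration, and both function names are hypothetical.

```python
def bag_label(instance_labels):
    """A bag is positive if at least one instance is positive, negative otherwise."""
    return any(instance_labels)

def bag_score(instance_scores):
    """Bag-level anomaly score, taken here as the maximum instance score (assumption)."""
    return max(instance_scores)

# Hypothetical video treated as a bag of segments (instances).
normal_video = [False, False, False]
anomalous_video = [False, True, False]

print(bag_label(normal_video))
print(bag_label(anomalous_video))
print(bag_score([0.1, 0.9, 0.3]))
```

In this setting only the bag label (for example, a video-level label) has to be known at training time, which is why the framework suits data where instance-level labels are uncertain.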

II-C Anomaly Detection Approaches

Anomaly detection approaches can be classified as depicted in Fig. 3.

[Figure 3 tree. Approaches: Model (Statistical: Parametric, Non-parametric); Proximity-based (Relative density, Distance); Classification (SVM, Bayesian, Rule-based); Reconstruction (Sparse, PCA, Autoencoder); Prediction; Others (Cluster, Fuzzy, Heuristic, Hybrid).]

Fig. 3: Classification of the anomaly detection methods based on different approaches.

II-C1 Model-based

Model-based approaches learn the normal behavior of data by representing them in terms of a set of parameters. Statistical approaches are generally used to learn the parameters of the model, as they try to fit the data into a stochastic model. Statistical approaches may be either parametric or non-parametric. Parametric methods assume that the normal data are generated by a parametric distribution with an associated probability density function. Examples are Gaussian mixture models [99], regression models [34], etc. In non-parametric statistical models, the structure is not defined a priori; instead, it is determined dynamically from the data. Examples are histogram-based [216], Dirichlet process mixture models (DPMM) [131], Bayesian network-based models [22], etc. A Bayesian network estimates the posterior probability of observing a class label from a set of normal class labels and the anomaly class labels, given a test data instance. The class label with the largest posterior is regarded as the predicted class for the given test instance. Typically, topic model-based anomaly detection methods use Bayesian non-parametric approaches [126, 84]. DNN-based models can also be categorized under parametric models, where the parameters are the weights and biases of the neural networks [154, 28, 112]. However, some researchers consider them classification approaches [97], while many approaches (statistical, classification, information theoretic, reconstruction-based) are used in anomaly detection. Neural network-based methods also adopt an information theoretic approach to reduce the cross entropy between the expected and the predicted outputs during model learning [87]. Hence, they may also be categorized under hybrid approaches.
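As a rough illustration of the histogram-based non-parametric models mentioned above, the sketch below builds a histogram from normal data and scores a test value by how rarely its bin was observed. The bin width, the score definition (one minus the relative bin frequency) and the sample values are assumptions chosen for illustration, not the formulation of any cited work.

```python
from collections import Counter

def fit_histogram(samples, bin_width):
    """Count normal samples per bin; no distribution shape is assumed in advance."""
    counts = Counter(int(x // bin_width) for x in samples)
    return counts, len(samples), bin_width

def histogram_score(x, model):
    """Anomaly score in [0, 1]: 1 minus the relative frequency of the bin of x."""
    counts, total, bin_width = model
    return 1.0 - counts.get(int(x // bin_width), 0) / total

# Hypothetical feature values observed under normal conditions.
normal_values = [10.2, 10.7, 11.1, 10.4, 11.8, 10.9]
model = fit_histogram(normal_values, bin_width=1.0)

print(histogram_score(10.5, model))  # falls in the most populated bin
print(histogram_score(25.0, model))  # falls in a bin never seen in normal data
```

A parametric counterpart would instead fix a distribution family (for example a Gaussian) and estimate only its parameters from the same data.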

II-C2 Proximity-based

In proximity-based approaches, anomalies are decided by how close they are to their neighbors. Distance-based approaches assume that normal data have dense neighborhoods [38]. Density-based approaches compare the density around a point with the density around its local neighbors. The relative density of a point compared to its neighbors is computed as an outlier score [107].
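Both proximity-based variants can be sketched with k-nearest-neighbor distances. The first function is a plain distance-based score; the second is a simplified relative-density ratio in the spirit of density-based outlier scoring, not a faithful implementation of any cited method. The function names, the choice of k and the sample points are assumptions.

```python
import math

def knn_distance(point, data, k):
    """Mean Euclidean distance from `point` to its k nearest neighbours in `data`."""
    dists = sorted(math.dist(point, q) for q in data if q != point)
    return sum(dists[:k]) / k

def relative_density_score(point, data, k):
    """Ratio of the point's k-NN distance to the mean k-NN distance of its neighbours.

    Values near 1 mean the point is as densely surrounded as its neighbours;
    values well above 1 mean it lies in a sparser region.
    """
    neighbours = sorted((q for q in data if q != point),
                        key=lambda q: math.dist(point, q))[:k]
    own = knn_distance(point, data, k)
    neighbour_mean = sum(knn_distance(q, data, k) for q in neighbours) / k
    return own / neighbour_mean

# Hypothetical 2-D feature descriptors of normal observations.
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]

print(relative_density_score(data[0], data, k=2))     # inside the dense group
print(relative_density_score((5.0, 5.0), data, k=2))  # far from the dense group
```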

II-C3 Classification-based

Classification-based anomaly detection methods assume that a classifier can distinguish between normal and anomalous classes in a given feature space. Such techniques can be divided into two categories: one-class and multi-class. Multi-class classification-based anomaly detection techniques assume that the training data contain labeled instances of normal and anomalous classes. A data point is assumed anomalous if it falls in the anomalous class [32]. One-class classification (OCC)-based anomaly detection techniques assume that all training data have one label [190, 192, 139, 205]. Such techniques learn a discriminative boundary around the normal instances using a one-class classification algorithm. Support Vector Machines (SVMs) in the one-class setting have been used extensively for anomaly detection in visual surveillance [29, 139]. Rule-based approaches learn rules that capture the normal behavior of a system [156]. A test instance that is not covered by any such rule is considered an anomaly.
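The rule-based idea at the end of this section can be sketched as follows. The rule format (one speed range per scene zone), the zone names and the values are hypothetical; the sketch only illustrates that an instance not covered by any learned rule is treated as anomalous.

```python
def learn_rules(normal_observations):
    """Learn one speed-range rule per zone from (zone, speed) pairs seen as normal."""
    rules = {}
    for zone, speed in normal_observations:
        low, high = rules.get(zone, (speed, speed))
        rules[zone] = (min(low, speed), max(high, speed))
    return rules

def is_anomalous(zone, speed, rules):
    """An instance is anomalous when no learned rule covers it."""
    if zone not in rules:
        return True
    low, high = rules[zone]
    return not (low <= speed <= high)

# Hypothetical normal observations: (scene zone, speed in km/h).
normal_observations = [
    ("road", 30.0), ("road", 60.0),
    ("walkway", 3.0), ("walkway", 6.0),
]
rules = learn_rules(normal_observations)

print(is_anomalous("road", 45.0, rules))     # covered by the road rule
print(is_anomalous("walkway", 40.0, rules))  # vehicle-like speed on a walkway
print(is_anomalous("median", 5.0, rules))    # zone never seen during training
```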

II-C4 Prediction-based

Prediction-based approaches detect anomalies by calculating the variation between the predicted and actual spatio-temporal characteristics of the feature descriptor [108]. HMM and LSTM models rely on such approaches for anomaly detection [20, 118, 119].
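The predict-then-compare principle can be sketched with a much simpler predictor than an HMM or LSTM. The example below assumes a constant-velocity model over a 2-D object track and flags the time steps whose prediction error exceeds a threshold; the predictor, the threshold, and the track are hypothetical choices for illustration only.

```python
import math

def prediction_errors(track):
    """Error between a constant-velocity prediction and the observed position,
    for every time step that has two preceding observations."""
    errors = []
    for t in range(2, len(track)):
        (x1, y1), (x2, y2) = track[t - 2], track[t - 1]
        predicted = (2 * x2 - x1, 2 * y2 - y1)  # repeat the last displacement
        errors.append(math.dist(predicted, track[t]))
    return errors

def flag_anomalies(track, threshold):
    """Return the time indices whose prediction error exceeds the threshold."""
    return [t + 2 for t, e in enumerate(prediction_errors(track)) if e > threshold]

# Hypothetical track: steady motion along x, then an abrupt turn at t = 4.
track = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 4), (3, 8)]
flags = flag_anomalies(track, threshold=1.0)
```

Only the step at which the motion changes abruptly is flagged; once the new motion becomes predictable again, the error drops below the threshold.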

II-C5 Reconstruction-based

Reconstruction-based techniques assume that normal data can be embedded into a lower-dimensional subspace in which normal instances and anomalies appear different. The anomaly is measured by the data reconstruction error. Examples include sparse coding [172, 218, 208], autoencoders [59], and principal component analysis (PCA)-based approaches [107].
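A minimal sketch of the reconstruction-error idea, using a one-component PCA, is given below. It assumes the normal data have non-zero variance and estimates the leading principal direction by power iteration; the function names and the data are hypothetical, and the sketch is not the method of any cited work.

```python
import math

def fit_principal_axis(data, iterations=100):
    """Return the mean and the leading principal direction (power iteration).
    Assumes the data have non-zero variance."""
    n, dim = len(data), len(data[0])
    mean = [sum(x[d] for x in data) / n for d in range(dim)]
    centred = [[x[d] - mean[d] for d in range(dim)] for x in data]
    cov = [[sum(c[a] * c[b] for c in centred) / n for b in range(dim)]
           for a in range(dim)]
    v = [1.0] * dim
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(dim)) for a in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v

def reconstruction_error(x, mean, axis):
    """Distance between a point and its projection onto the 1-D subspace."""
    centred = [xi - mi for xi, mi in zip(x, mean)]
    coeff = sum(c * a for c, a in zip(centred, axis))
    recon = [coeff * a for a in axis]
    return math.sqrt(sum((c - r) ** 2 for c, r in zip(centred, recon)))

# Hypothetical normal data lying on the line y = 2x.
normal = [(float(i), 2.0 * i) for i in range(10)]
mean, axis = fit_principal_axis(normal)
err_normal = reconstruction_error((4.0, 8.0), mean, axis)
err_anomaly = reconstruction_error((4.0, 0.0), mean, axis)
```

A point that follows the normal pattern is reconstructed almost exactly, whereas a point off the learned subspace yields a large error that can be thresholded as the anomaly score.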

II-C6 Other Approaches

There are two types of clustering approaches. The first relies on the assumption that normal data lie in a cluster, while anomalous data are not associated with any cluster [145]. The second assumes that normal data instances belong to large and dense clusters, while anomalies belong to small clusters. Fuzzy inference systems take a fuzzy data point and use rules on membership, together with the strength at which the data point fires those rules, to decide whether the data are anomalous [201, 98]. Heuristic methods rely on intuition about feature values, spatial location, and contextual information to decide on anomalies. However, many practical systems do not depend entirely on one technique; rather, hybrid approaches are used for anomaly detection [187, 33, 123]. Table IV presents the aforementioned categorization.
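Both clustering assumptions can be illustrated with a small sketch. It assumes a simple leader-style clustering (a point joins the first cluster whose leader lies within a fixed radius, otherwise it starts a new cluster) and flags members of clusters smaller than a minimum size. The clustering rule, the parameters, and the data are hypothetical and chosen only to keep the example short.

```python
import math

def leader_clustering(points, radius):
    """Assign each point to the first cluster whose leader is within `radius`;
    a point with no such leader starts a new cluster."""
    leaders, members = [], []
    for p in points:
        for c, leader in enumerate(leaders):
            if math.dist(p, leader) <= radius:
                members[c].append(p)
                break
        else:
            leaders.append(p)
            members.append([p])
    return members

def small_cluster_anomalies(points, radius, min_size):
    """Flag every point that ends up in a cluster smaller than `min_size`."""
    clusters = leader_clustering(points, radius)
    return [p for cluster in clusters if len(cluster) < min_size for p in cluster]

# Hypothetical data: two groups of three points and one stray point.
points = [(0, 0), (0.2, 0.1), (0.1, 0.3),
          (5, 5), (5.1, 5.2), (4.9, 5.1),
          (9, 0)]
anomalies = small_cluster_anomalies(points, radius=1.0, min_size=2)
```

Here the stray point cannot be associated with either existing group, so it forms a cluster of size one and is reported as the anomaly.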

Specific Techniques Ref.
SVM [163, 143, 2, 190, 70, 16, 160]
Sparse [208, 21, 111, 113, 185, 149]
PCA [96]
Autoencoder [161, 59, 37, 150]
Regression [187, 153]
Density-based [50, 114]
Clustering-based [51, 131, 179]
Statistical methods [72, 35, 99]
Prediction [20, 118, 88, 10]
Bayesian Networks [71]
Fuzzy logic-based [201, 98]
Hybrid [207, 150, 107, 206, 123, 62, 177, 5, 66]
Relative Density [107]
Heuristic [199, 30, 211, 93, 167, 117, 69]
TABLE IV: Specific categorization of anomaly detection techniques

II-D Features Used in Anomaly Detection

Ref. Features Learning Anomaly Criteria Highlight
Yang (2013) [207] Sub-trajectories Multi instance learning Nearest neighborhood based approach with Hausdorff distance-based threshold for anomaly detection. Sub-trajectories-based local anomaly detection capability.
Roshtkhari (2013) [152] 3D Spatio-temporal volume Code-book model Threshold applied on likelihood/saliency map. Fast anomaly localization requiring less training data. Does not require any feature analysis, background/foreground segmentation and tracking, and can be applied for real-time applications.
Li (2014) [96] MDTs from Spatio-temporal patches Dynamic Texture Model Threshold on negative log-likelihood on temporal mixture of dynamic textures for temporal anomaly and threshold on the saliency for spatial anomalies. Capable of detecting both temporal and spatial anomalies in complex crowded scenes.
Kaltsa (2014) [77] HOSA+HOGs over image patches SVM OCSVM based anomaly detection. Robustness to local noise and anomaly detection in crowded scenes.
Jeong (2014) [73] Trajectories and pixel velocities Hybrid (LDA + GMM) Threshold on the probability score. Thorough study conducted at intersections and roads for traffic pattern analysis.
Zhu (2014) [222] Histogram of optical flow features Sparse coding Threshold on reconstruction cost used as anomaly measure. The method can detect both local and global anomalies. Experiments were not conducted on traffic junctions, though the method could be suitable for busy junctions.
Kaltsa (2015) [76] Hybrid (HOS + HOG + PSO) SVM Support Vector Data Description (SVDD) method [173] for anomaly detection. Swarm intelligence is exploited for the extraction of robust motion and appearance features to model and to detect anomalies.
Mousavi (2015) [126] Histogram of Oriented Tracklets (HOT) LDA Log-likelihood based fixed threshold of visual words for anomaly detection. Comprehensive evaluation using topic model based anomaly detection and localization for a wide range of real-world videos.
Cheng (2015) [34] Spatio-temporal interest points (STIPs) [43] Gaussian regression Local anomalies: k-NN-based likelihood threshold with respect to the visual vocabulary of STIP codebook. Global anomalies: Using global negative log likelihood threshold. STIPs effectively used for local and global anomaly detection.
Medel (2016) [118] Automatic video features with CNN. Conv-LSTM Reconstruction error between predicted and actual output. Effective for recognizing abnormalities when the training data is loosely supervised to contain mostly normal events.
Zhang (2016) [217] Histogram of optical flow Clustering Anomaly score based on Hamming distance. Locality sensitive hashing filters used in anomaly detection.
Lan (2016) [91] HOG Heuristic method Anomalies detected using relative speeds of detected objects. An interesting study about abandoned objects that could possibly cause traffic accidents or some other untoward incidents.
Hasan (2016) [59] Handcrafted HOG+HOF [184] and automatic CNN extracted features Dual Autoencoder model Anomaly score, namely regularity score derived using reconstruction error in autoencoders. A regularity score, used as a measure of normalcy in a scene, derived using both hand crafted features and automatic features using fully convolutional feed-forward autoencoder.
Hinami (2017) [62] Deep features from CNN Multi-task Fast R-CNN. Anomaly detection with a combination of semantic features using (a) nearest neighbor-based method (NN), (b) OCSVM and (c) KDE. It addresses the problem of joint detection and recounting of abnormal events in videos in the presence of false alarms.
Wen (2017) [200] Object (velocity and direction) GMM Model based anomaly detection. Speeding events detection that could be relevant on road, though authors have tested the method for indoor scenarios.
Ravanbakhsh (2017) [148] Optical flow frames + Normal frames GAN Anomaly score as a fusion of optical-flow and appearance reconstruction error. Global and local anomaly detection in crowded scenes.
Lin (2017) [104] 3D-Tube SVM Contextual information embedded in trajectory thermal transfer fields using OCSVM. This is the first anomaly detection method of its kind using thermal fields that can detect contextual anomalies.
Liu (2017) [108] Automatically extracted optical flow, intensity and gradient features. GAN Peak Signal to Noise Ratio (PSNR) score based on optical flow, intensity, gradient loss. DNN-based prediction [151] and a GAN-based [3] discriminator applied on optical flow frames derived using [44]; robust to the uncertainty in normal events and sensitive to abnormal events.
Colque (2017) [38] HOFME Histogram based model Nearest Neighbor threshold. A new feature descriptor HOFME that could handle diverse anomaly scenarios as compared with conventional features.
Giannakeris (2018) [52] Trajectory Fisher vector SVM Anomaly score derived from the Fisher vector using OCSVM. Anomaly detection done using robust optical flow descriptors of the detected vehicles with the use of DNNs and Fisher vector representations from spatiotemporal visual volumes.
Lee (2018) [92] Real and Fake frames GAN Abnormality score derived using the losses of the generator and the discriminator. Can detect anomalies from dataset containing complex motion and frequent occlusions.
Kaltsa (2018) [78] Code words of spatio-temporal regions Multiple HDPs Confidence score of reconstruction of region clips. Both local and global anomaly detection using super-pixels and interest point tracking [6] applied on real-life videos.
Sultani (2018) [168] Video clips Deep MIL Ranking Model An anomaly score using sparsity and smoothness constraints. A generic method applied on a variety of real-life scenarios.
TABLE V: Representative work based on used features
Ref. Technique Scene Anomalies Datasets
Yang (2013) [207] Multi instance learning Lobby. One person walking, browsing, resting, slumping or fainting, leaving bags behind, people/groups meeting, walking together and then splitting up and two people fighting. CAVIAR.
Roshtkhari (2013) [152] Code-book (Sparse) model Subway, walkway. Abnormal walking patterns, crawling, jumping over objects, falling down, non-pedestrians on a walkway, walking in the wrong direction, irregular interactions between people and some other events including sudden stopping, running fast, walking in the wrong direction and loitering. UCSD (Ped1, Ped2), Belleview and Person.
Jeong (2014) [73] LDA + GMM Junctions, walkway, roads, public gathering area. Illegal U-turn, vehicle in opposite direction, disordering in the traffic signal, over speed on a pavement, unusual crowd speed, a car stops on a railway. UCSD, UMN, MIT, QMUL and In-house datasets.
Li (2014) [96] Dynamic Texture model Walkways, junction. Non-pedestrian entities in the walkways, people walking across a walkway or in the surrounding grass, U-turn. UCSD (Ped1, Ped2), U-turn and UMN.
Mo (2014) [123] Sparsity Model + OCSVM Junction, road, parking lot. Man suddenly falls on floor, vehicle almost hits a pedestrian, car violates the stop sign rule, car fails to yield to oncoming car while turning left, driver backs his car in front of stop sign. i-LIDS, CAVIAR and In-house dataset namely XEROX.
Patino (2014) [141] Statistical with heuristic approach Parking lot, road intersection. Unusual object trajectories such as U-turn, vehicle stopping at pedestrian way, person stopping between two lanes outside zebra passages, person crossing lanes outside zebra passages, loitering and vehicle/person staying at a place for longer duration. ARENA, CAVIAR and MIT trajectory dataset.
Akos (2014) [10] Hybrid (HMM + SVM + k-NN) Intersection. Collision, nearby passes. NGSIM and AIRS.
Wang (2014) [192] OCSVM Walkway, public gathering place. Local dispersion of crowds. PETS2009 and UMN.
Yun (2014) [211] Motion interaction field (MIF) symmetry model Junction. Accident detection. Car accident.
Xia (2015) [202] Low rank approximation on motion matrix created using optical flows. Road, intersection. Accident detection. In-house dataset.
Cheng (2015) [34] Gaussian regression Road, walkways, subway, intersection. Non pedestrians appearing in walkway, chase, fight, run together, traffic interruption, jaywalk, illegal u-turn, strange driving. UCSD (Ped1), Behave and QMUL.
Xu (2015) [2] Hybrid (DNN + Autoencoder + OCSVM) Walkways. Non pedestrians appearing in walkway. UCSD (Ped1, Ped2).
Kaviani (2015) [84] Hybrid (LDA+STC+pLSA+FSTM) Roadways, Junctions. Accident detection. QMUL and In-house datasets.
Nguyen (2015) [134] Bayesian non-parametric Junctions. Street fight, loitering, truck-unusual stopping, big truck blocking camera. MIT.
Pathak (2015) [138] pLSA Junction, highway, roadways. Car stops after the stop-line, jaywalk, vehicle abruptly crossing the road. Idiap, highway (In-house) and i-LIDS.
Medel (2015) [119] ConvLSTM Walkways, roadways. People walking perpendicular to the walkway or off the walkway, movement of non-pedestrian entities and anomalous pedestrian motions. UCSD (Ped1, Ped2) and Avenue.
Zhou (2016) [220] CNN Junction, walkways, dispersing crowd. U-turn, unexpected presence of vehicles. UCSD, UMN, and U-turn.
Zhang (2016) [216] Hybrid (Histogram of Optical flow and Support Vector Data Description) Walkways. Non pedestrians on walkways. UCSD Ped1.
Xu (2017) [205] OCSVM with SDAE features Walkways. Non pedestrians on walkways. UCSD.
Vishnu (2017) [180] Hybrid (MLR+DNN+vehiclecount) Highway, Roadway, Junction. Congestion detection, ambulance detection, accident detection. In-house datasets.
Liu (2017) [108] Heuristic Roadways, walkways, junction. Throwing objects, loitering and running, non pedestrians on walkways, presence of people at unexpected area of road. Avenue, UCSD Ped1, UCSD Ped2 and ShanghaiTech.
Giannakeris (2018) [52] SVM Roadways. Car crashes, stalled vehicles. NVIDIA CITY.
Chebiyyam (2017) [31] Heuristic using SVM and Region Association Graph Parking lot, walkways. Object encircling a particular region, target switching between two or more regions for a sustained period of time. MIT Parking trajectory, Avenue and a Custom dataset.
Yun (2017) [212] Sparse learning using motion interaction field [211] Junction, roadways, public gathering area. Car accidents, crowd riots, and uncontrolled fighting. BEHAVE, UMN and Car accident.
Wang (2018) [186] Sparse topic Model Junction, Roadways. Car deviating from normal pattern, conflicting patterns, vehicle suddenly interrupting normal pattern, jaywalk, vehicle retrograde, pedestrian near collisions with vehicle. i-LIDS and QMUL.
Kaltsa (2018) [78] HDP Intersections. Jay walking, illegal U-turns, wrong vehicle direction, traffic break. QMUL, Idiap and U-turn.
Sultani (2018) [168] Deep MIL Ranking Model Intersection, roadways, walkways. Abuse, arrest, arson, assault, accident, burglary, fighting, robbery. UMN, UCSD (Ped1, Ped2), Avenue, Subway, BOSS, Abnormal Crowd, and a set of Local datasets.
TABLE VI: Representative work on scope of applied areas
Fig. 4: Overall classification of features used in anomaly detection.