What is Outlier Detection?
How does Outlier Detection work?
Outlier detection works by observing a data set and defining various points as outliers. There are several methods for defining outliers, and a popular method is through z-score analysis. The z-score is a value that represents the number of standard deviations that a data point is away from the mean. Particularly when dealing with parametric distributions in a low dimensional space, the a z-score threshold can help filter outliers from a data set.
Outlier Detection vs. Novelty Detection
In terms of anomaly detection, both outlier detection and novelty detection seem very similar. However, the two methods define different forms of anomalies. In simple terms, outlier detection can be thought as unsupervised learning, and novelty detection represents semi-supervised learning. A method of novelty detection is cluster analysis, a technique that outlier detection can never use. By definition, outliers are not located near any other populated area of data points. Should a cluster of points arise, the mean would adjust, and would no longer classify as outliers.