Detecting and Classifying Outliers in Big Functional Data

12/16/2019
by Oluwasegun Taiwo Ojo, et al.

This paper proposes two new outlier detection methods for identifying different types of outliers in (big) functional data sets. The proposed methods improve on an existing method called Massive Unsupervised Outlier Detection (MUOD). MUOD identifies different types of outliers by computing three indices for each curve, all based on the simple concepts of linear correlation and regression, which measure outlyingness in terms of shape, magnitude and amplitude relative to the other curves. To improve the performance of MUOD, we present 'Semifast-MUOD', which uses a sample of the observations in the computation of the indices, and 'Fast-MUOD', a fast implementation which uses the component-wise median in the computation of the indices. The classical boxplot is used to separate the indices of the outliers from those of the typical observations. Performance evaluation of the proposed methods using real and simulated data shows significant improvements over MUOD, both in outlier detection accuracy and in computational time. Further comparisons with some recent outlier detection methods for functional data also show superior or comparable outlier detection accuracy.
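To make the idea concrete, the following is a minimal Python sketch of median-based indices in the spirit of Fast-MUOD, not the authors' implementation. The abstract only states that the indices rest on linear correlation and regression against the component-wise median; the exact formulas used here (one minus the correlation for shape, the regression slope's distance from 1 for amplitude, the absolute intercept for magnitude), the function names `fast_muod_indices` and `boxplot_outliers`, and the 1.5 × IQR upper fence are assumptions made for illustration.

```python
import numpy as np


def fast_muod_indices(curves):
    """Illustrative shape, amplitude and magnitude indices (assumed formulas).

    curves: array of shape (n_curves, n_points), one discretized curve per row.
    Each curve is compared with the component-wise (pointwise) median curve.
    Constant curves (zero variance) would produce a division by zero here.
    """
    curves = np.asarray(curves, dtype=float)
    n_points = curves.shape[1]

    # Reference curve: median of all curves at each evaluation point.
    median_curve = np.median(curves, axis=0)
    med_centered = median_curve - median_curve.mean()
    med_var = np.mean(med_centered ** 2)

    # Covariance and correlation of every curve with the median curve.
    centered = curves - curves.mean(axis=1, keepdims=True)
    cov = centered @ med_centered / n_points
    std = np.sqrt(np.mean(centered ** 2, axis=1))
    corr = cov / (std * np.sqrt(med_var))

    # Least-squares regression of each curve on the median curve.
    slope = cov / med_var
    intercept = curves.mean(axis=1) - slope * median_curve.mean()

    return {
        "shape": np.abs(corr - 1.0),       # low correlation -> shape outlier
        "amplitude": np.abs(slope - 1.0),  # slope far from 1 -> amplitude outlier
        "magnitude": np.abs(intercept),    # large intercept -> magnitude outlier
    }


def boxplot_outliers(index, k=1.5):
    """Positions of index values above the upper boxplot fence Q3 + k * IQR."""
    q1, q3 = np.percentile(index, [25, 75])
    return np.flatnonzero(index > q3 + k * (q3 - q1))
```

Calling `boxplot_outliers` on each of the three index vectors gives one set of flagged curves per outlier type, which is how a curve can be classified as well as detected. Only the upper fence is used in this sketch because all three indices are non-negative and large values indicate outlyingness.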
