Detecting and Classifying Outliers in Big Functional Data

12/16/2019
by   Oluwasegun Taiwo Ojo, et al.
This paper proposes two new outlier detection methods for identifying different types of outliers in (big) functional data sets. Both improve on an existing method, Massive Unsupervised Outlier Detection (MUOD). MUOD identifies different types of outliers by computing three indices for each curve, all based on the simple concepts of linear correlation and regression, which measure outlyingness in terms of shape, magnitude and amplitude relative to the other curves. To improve the performance of MUOD, we present 'Semifast-MUOD', which uses a sample of the observations in the computation of the indices, and 'Fast-MUOD', a fast implementation which uses the component-wise median in the computation of the indices. The classical boxplot is used to separate the indices of the outliers from those of the typical observations. Performance evaluations of the proposed methods using real and simulated data show significant improvements over MUOD in both outlier detection and computational time. Further comparisons with some recent outlier detection methods for functional data also show superior or comparable outlier detection accuracy.
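The idea of comparing every curve with the component-wise median and then applying a boxplot rule can be illustrated with a minimal Python sketch. The abstract gives no formulas, so everything specific here is an assumption made for illustration: shape outlyingness is taken as the absolute deviation of the Pearson correlation from one, amplitude as the absolute deviation of the regression slope from one, magnitude as the absolute regression intercept, and the cutoff as the usual Q3 + 1.5 IQR upper whisker. The function names `fast_muod_indices` and `boxplot_outliers` are hypothetical and do not come from the paper or any package.

```python
import numpy as np

def fast_muod_indices(curves):
    """Return (shape, magnitude, amplitude) indices for an (n_curves, n_points) array.

    Assumed definitions (not taken from the abstract): each curve is compared
    with the component-wise median curve via correlation and simple regression.
    A constant curve or constant median would cause a division by zero.
    """
    X = np.asarray(curves, dtype=float)
    med = np.median(X, axis=0)                    # component-wise median curve
    med_c = med - med.mean()                      # centred median
    X_c = X - X.mean(axis=1, keepdims=True)       # each curve centred on its own mean
    n_points = X.shape[1]
    cov = X_c @ med_c / n_points                  # covariance of each curve with the median
    var_med = med_c @ med_c / n_points
    sd_x = np.sqrt((X_c ** 2).mean(axis=1))
    corr = cov / (sd_x * np.sqrt(var_med))        # Pearson correlation with the median
    slope = cov / var_med                         # slope of curve regressed on the median
    intercept = X.mean(axis=1) - slope * med.mean()
    return np.abs(corr - 1), np.abs(intercept), np.abs(slope - 1)

def boxplot_outliers(index, k=1.5):
    """Positions whose index exceeds the upper boxplot whisker Q3 + k * IQR."""
    q1, q3 = np.percentile(index, [25, 75])
    return np.flatnonzero(index > q3 + k * (q3 - q1))

# Synthetic example: 20 noisy sine curves plus one outlier of each type.
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 50)
base = np.sin(2 * np.pi * t)
typical = base + rng.normal(0, 0.05, size=(20, t.size))
curves = np.vstack([
    typical,
    base + 5,               # row 20: shifted curve (magnitude)
    3 * base,               # row 21: stretched curve (amplitude)
    np.cos(2 * np.pi * t),  # row 22: different shape
])

shape_idx, magnitude_idx, amplitude_idx = fast_muod_indices(curves)
print("magnitude outliers:", boxplot_outliers(magnitude_idx))
print("amplitude outliers:", boxplot_outliers(amplitude_idx))
print("shape outliers:", boxplot_outliers(shape_idx))
```

Because each curve is compared with a single median curve rather than with every other curve, the cost in this sketch grows linearly with the number of curves. A curve can be flagged by more than one index, so the three index sets need not be disjoint.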

research
07/26/2022

Multivariate Functional Outlier Detection using the FastMUOD Indices

We present definitions and properties of the fast massive unsupervised o...
research
05/15/2019

Automated detection of business-relevant outliers in e-commerce conversion rate

We evaluate how modern outlier detection methods perform in identifying ...
research
05/23/2020

A New Algorithm using Component-wise Adaptive Trimming For Robust Mixture Regression

Mixture regression provides a statistical model for teasing out latent h...
research
12/15/2021

Gaining Outlier Resistance with Progressive Quantiles: Fast Algorithms and Theoretical Studies

Outliers widely occur in big-data applications and may severely affect s...
research
10/19/2019

Efficient Discovery of Meaningful Outlier Relationships

We propose PODS (Predictable Outliers in Data-trendS), a method that, gi...
research
10/22/2021

DeepAg: Deep Learning Approach for Measuring the Effects of Outlier Events on Agricultural Production and Policy

Quantitative metrics that measure the global economy's equilibrium have ...
research
07/02/2021

Depth-based Outlier Detection for Grouped Smart Meters: a Functional Data Analysis Toolbox

Smart metering infrastructures collect data almost continuously in the f...
