C-AllOut: Catching & Calling Outliers by Type

10/13/2021
by Guilherme D. F. Silva, et al.

Given an unlabeled dataset, wherein we have access only to pairwise similarities (or distances), how can we effectively (1) detect outliers, and (2) annotate/tag the outliers by type? Outlier detection has a large literature, yet we find a key gap in the field: to our knowledge, no existing work addresses the outlier annotation problem. Outliers are broadly classified into three types, representing distinct patterns that could be valuable to analysts: (a) global outliers are severe yet isolated cases that do not repeat, e.g., a data collection error; (b) local outliers diverge from their peers within a context, e.g., a particularly short basketball player; and (c) collective outliers are isolated micro-clusters that may indicate coalition or repetition, e.g., frauds that exploit the same loophole. This paper presents C-AllOut: a novel and effective outlier detector that annotates outliers by type. It is parameter-free and scalable, and it can work with pairwise similarities (or distances) alone when needed. We show that C-AllOut performs on par with or significantly better than state-of-the-art detectors when spotting outliers regardless of their type. It is also highly effective in annotating outliers of particular types, a task that none of the baselines can perform.
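To make the distinction between outlier types concrete, here is a minimal sketch that uses only a pairwise distance matrix. It is not the C-AllOut algorithm; it is a plain k-nearest-neighbor distance score on hypothetical toy data (the point coordinates, `knn_distance`, and `scores_by_label` are all assumptions made for illustration). It covers global and collective outliers only; local outliers would need a density-relative score, which is not shown.

```python
import math

def pairwise_distances(points):
    """Build the full distance matrix; everything below uses only this matrix."""
    return [[math.dist(p, q) for q in points] for p in points]

def knn_distance(dist, i, k):
    """Distance from item i to its k-th nearest neighbor (excluding itself)."""
    others = sorted(d for j, d in enumerate(dist[i]) if j != i)
    return others[k - 1]

# Hypothetical toy data: a 5x5 grid of inliers, one far-away point,
# and a tight trio standing in for a micro-cluster.
inliers = [(float(x), float(y)) for x in range(5) for y in range(5)]
global_outlier = [(30.0, 30.0)]
micro_cluster = [(-20.0, 0.0), (-20.0, 0.5), (-19.5, 0.0)]

points = inliers + global_outlier + micro_cluster
labels = ["inlier"] * len(inliers) + ["global"] + ["collective"] * len(micro_cluster)
dist = pairwise_distances(points)

def scores_by_label(k):
    """Group k-NN distance scores by the ground-truth label of each point."""
    grouped = {}
    for i, label in enumerate(labels):
        grouped.setdefault(label, []).append(knn_distance(dist, i, k))
    return grouped

# Compare the score ranges per label for a small and a larger neighborhood.
for k in (1, 5):
    grouped = scores_by_label(k)
    for label in ("inlier", "global", "collective"):
        lo, hi = min(grouped[label]), max(grouped[label])
        print(f"k={k} {label:10s} min={lo:.2f} max={hi:.2f}")
```

In this toy setup, the micro-cluster members score below every inlier when k=1, because each has a close neighbor inside its own cluster, and far above every inlier when k=5, once the neighborhood is larger than the cluster. The global outlier scores high in both cases. A single score of this kind can therefore rank outliers but says nothing about their type, which is the annotation gap the abstract describes.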


