Explainable outlier detection through decision tree conditioning

01/02/2020
by   David Cortes, et al.

This work describes an outlier detection procedure (named "OutlierTree") loosely based on the GritBot software developed by RuleQuest Research. It works by evaluating and following supervised decision tree splits on variables; within each resulting branch, 1-d confidence intervals are constructed for the target variable, and potential outliers are flagged according to these confidence intervals. Under this logic, it is possible to produce human-readable explanations for why a given value of a variable in an observation can be considered an outlier: the decision tree branch conditions are combined with general distribution statistics among the non-outlier observations that fell into the same branch, and these can then be contrasted against the value that lies outside the confidence interval (CI). The supervised splits help to ensure that the generated conditions are not spurious, but rather are related to the target variable and have logical breakpoints.
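To make the idea of conditioning concrete, here is a minimal Python sketch. It is not the OutlierTree implementation, and several parts are simplifying assumptions: the split variable and threshold are supplied by hand rather than learned by a supervised tree, a leave-one-out z-score stands in for the paper's confidence intervals, and the function name `explain_outliers`, the cutoff `z_max`, and the toy data are all hypothetical.

```python
from statistics import mean, stdev


def explain_outliers(rows, target, split_var, threshold, z_max=3.0):
    """Flag values of `target` that are unusual within their branch of one split.

    Assumption: a single hand-picked threshold on `split_var` replaces the
    learned decision tree, and a z-score cutoff replaces the 1-d CI.
    """
    branches = {
        f"{split_var} <= {threshold}": [r for r in rows if r[split_var] <= threshold],
        f"{split_var} > {threshold}": [r for r in rows if r[split_var] > threshold],
    }
    findings = []
    for condition, branch in branches.items():
        for i, row in enumerate(branch):
            # Statistics come from the *other* rows in the branch, so a
            # candidate outlier does not inflate its own reference spread.
            others = [r[target] for j, r in enumerate(branch) if j != i]
            if len(others) < 3:
                continue  # too few rows to estimate a spread
            mu, sd = mean(others), stdev(others)
            if sd == 0:
                continue  # constant branch: z-score is undefined
            z = (row[target] - mu) / sd
            if abs(z) > z_max:
                findings.append({
                    "condition": condition,
                    "value": row[target],
                    "explanation": (
                        f"{target}={row[target]} is unusual given {condition} "
                        f"(other rows in branch: mean {mu:.1f}, sd {sd:.1f})"
                    ),
                })
    return findings


# Toy data (invented for illustration): height depends strongly on age.
ages = [6, 7, 8, 9, 10, 7, 8, 9, 14, 15, 16, 17, 18, 15, 16, 17]
heights = [118, 122, 127, 132, 138, 124, 170, 134,
           163, 165, 170, 172, 175, 168, 171, 174]
rows = [{"age": a, "height_cm": h} for a, h in zip(ages, heights)]

findings = explain_outliers(rows, target="height_cm", split_var="age", threshold=10)
for f in findings:
    print(f["explanation"])
```

In this toy data a height of 170 also occurs among the older rows, so it is unremarkable on its own; it is flagged only once the branch condition on age is taken into account, and the condition plus the branch statistics form the explanation.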


research
04/12/2016

Confidence Decision Trees via Online and Active Learning for Streaming (BIG) Data

Decision tree classifiers are a widely used tool in data stream mining. ...
research
12/16/2017

NDT: Neual Decision Tree Towards Fully Functioned Neural Graph

Though traditional algorithms could be embedded into neural architecture...
research
03/11/2023

Interpretable Outlier Summarization

Outlier detection is critical in real applications to prevent financial ...
research
11/29/2017

Valid Inference Corrected for Outlier Removal

Ordinary least square (OLS) estimation of a linear regression model is w...
research
11/30/2021

Modelling hetegeneous treatment effects by quantitle local polynomial decision tree and forest

To further develop the statistical inference problem for heterogeneous t...
research
03/09/2018

Sequential Outlier Detection based on Incremental Decision Trees

We introduce an online outlier detection algorithm to detect outliers in...
research
10/18/2022

Multivariate outlier explanations using Shapley values and Mahalanobis distances

For the purpose of explaining multivariate outlyingness, it is shown tha...
