Explainable outlier detection through decision tree conditioning

01/02/2020 ∙ by David Cortes

This work describes an outlier detection procedure (named "OutlierTree") loosely based on the GritBot software developed by RuleQuest Research. It works by evaluating and following supervised decision tree splits on variables; within each branch, one-dimensional confidence intervals are constructed for the target variable, and potential outliers are flagged according to these intervals. Under this logic, it is possible to produce human-readable explanations for why a given value of a variable in an observation can be considered an outlier: the decision tree branch conditions are reported together with general distribution statistics among the non-outlier observations that fell into the same branch, against which the value lying outside the confidence interval can be contrasted. The supervised splits help ensure that the generated conditions are not spurious, but rather related to the target variable and placed at logical breakpoints.
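
To make this logic concrete, the sketch below illustrates the branch-plus-confidence-interval idea on toy data. It is not the OutlierTree implementation: the dataset, the single hard-coded split on age (OutlierTree evaluates candidate decision tree splits against the target instead), and the z-score interval with a 3.5 cutoff are all assumptions made for illustration; the real procedure also computes its statistics on the non-outlier observations only.

```python
import numpy as np
import pandas as pd

def flag_outliers_in_branch(branch, target, condition_desc, z=3.5):
    """Build a 1-d confidence interval for `target` from the observations
    that fell into this branch, flag values outside of it, and report the
    branch condition plus distribution statistics as the explanation."""
    vals = branch[target].to_numpy(dtype=float)
    mean, sd = vals.mean(), vals.std()
    lo, hi = mean - z * sd, mean + z * sd
    mask = (vals < lo) | (vals > hi)
    for idx, v in zip(branch.index[mask], vals[mask]):
        print(f"row {idx}: {target}={v:,.0f} lies outside [{lo:,.0f}, {hi:,.0f}] "
              f"(mean={mean:,.0f}, sd={sd:,.0f}) given {condition_desc}")

# Toy data in which income depends on an age group (an assumption for the example)
rng = np.random.default_rng(0)
age = rng.integers(18, 70, size=500)
income = np.where(age < 40,
                  rng.normal(30_000, 5_000, size=500),
                  rng.normal(60_000, 8_000, size=500))
df = pd.DataFrame({"age": age, "income": income})
df.loc[0, ["age", "income"]] = (25, 55_000)  # planted conditional outlier

# One supervised split on `age`; each branch gets its own confidence interval,
# so a value is judged relative to observations meeting the same conditions.
for cond, branch in (("age < 40", df[df["age"] < 40]),
                     ("age >= 40", df[df["age"] >= 40])):
    flag_outliers_in_branch(branch, "income", cond)
```

In this toy example, an income of 55,000 would pass a single interval built over all 500 rows (the overall spread across age groups is much wider), but it is flagged once the split conditions on age < 40, yielding an explanation of the form "income=55,000 lies outside [...] given age < 40".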

Code Repositories

outliertree

(Python, R, C++) Explainable outlier/anomaly detection through decision tree conditioning


outliertree

Explainable outlier/anomaly detection for Ruby


bagged.outliertrees

Explainable Unsupervised Outlier Detection Method Based on the OutlierTree Procedure (Cortes, 2020)


bagged.outliertrees

Read-only mirror of the CRAN R package repository. bagged.outliertrees — Robust Explainable Outlier Detection Based on OutlierTree.
Homepage: https://github.com/RafaJPSantos/bagged.outliertrees
Bug reports: https://github.com/RafaJPSantos/bagged.outliertrees/issues

