Testing for Outliers with Conformal p-values

04/16/2021
by Stephen Bates, et al.

This paper studies the construction of p-values for nonparametric outlier detection, taking a multiple-testing perspective. The goal is to test whether new independent samples belong to the same distribution as a reference data set or are outliers. We propose a solution based on conformal inference, a broadly applicable framework which yields p-values that are marginally valid but mutually dependent for different test points. We prove these p-values are positively dependent and enable exact false discovery rate control, although in a relatively weak marginal sense. We then introduce a new method to compute p-values that are both valid conditionally on the training data and independent of each other for different test points; this paves the way to stronger type-I error guarantees. Our results depart from classical conformal inference as we leverage concentration inequalities rather than combinatorial arguments to establish our finite-sample guarantees. Furthermore, our techniques also yield a uniform confidence bound for the false positive rate of any outlier detection algorithm, as a function of the threshold applied to its raw statistics. Finally, the relevance of our results is demonstrated by numerical experiments on real and simulated data.
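To make the abstract's first construction concrete, here is a minimal sketch of marginal conformal p-values followed by a multiple-testing step. It assumes a scoring convention in which larger scores mean "more outlying", and it uses the Benjamini-Hochberg procedure as the FDR-controlling rule; the abstract does not name either, so both are assumptions of this illustration. The function names `conformal_pvalues` and `benjamini_hochberg` are hypothetical, and the Gaussian scores stand in for the output of any outlier detection algorithm fitted on separate training data.

```python
import numpy as np


def conformal_pvalues(cal_scores, test_scores):
    """Marginal conformal p-values from held-out calibration scores.

    Assumed convention: a larger score means a more outlying point.
    p = (1 + #{calibration scores >= test score}) / (n + 1)
    """
    cal = np.sort(np.asarray(cal_scores, dtype=float))
    test = np.asarray(test_scores, dtype=float)
    n = cal.size
    # searchsorted(side="left") counts calibration scores strictly below each
    # test score, so n minus that count is the number that are >= it.
    n_ge = n - np.searchsorted(cal, test, side="left")
    return (1.0 + n_ge) / (n + 1.0)


def benjamini_hochberg(pvals, alpha=0.1):
    """Boolean rejection mask from the Benjamini-Hochberg step-up rule."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    thresholds = alpha * np.arange(1, m + 1) / m
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        # Reject every hypothesis up to the largest sorted index that passes.
        k = np.max(np.nonzero(below)[0])
        reject[order[: k + 1]] = True
    return reject


# Illustrative data only: simulated scores, not results from the paper.
rng = np.random.default_rng(0)
cal_scores = rng.normal(size=999)                       # inlier calibration scores
test_scores = np.concatenate([rng.normal(size=95),      # inlier test points
                              rng.normal(loc=4.0, size=5)])  # shifted outliers
pvals = conformal_pvalues(cal_scores, test_scores)
flagged = benjamini_hochberg(pvals, alpha=0.1)
print("number flagged as outliers:", int(flagged.sum()))
```

All test points share one calibration set, which is why these p-values are mutually dependent; the sketch does not implement the calibration-conditional p-values or the uniform false positive rate bound that the abstract also describes.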

research
08/23/2022

Integrative conformal p-values for powerful out-of-distribution testing with labeled outliers

This paper develops novel conformal methods to test whether a new observ...
research
10/03/2021

Learn then Test: Calibrating Predictive Algorithms to Achieve Risk Control

We introduce Learn then Test, a framework for calibrating machine learni...
research
08/10/2023

Rank tests for outlier detection

In novelty detection, the objective is to determine whether the test sam...
research
12/13/2017

Multiple testing for outlier detection in functional data

We propose a novel procedure for outlier detection in functional data, i...
research
01/23/2022

Elementary proofs of four standard results on false discovery rate

We collect self-contained elementary proofs of four standard results in ...
research
07/18/2023

Model-free selective inference under covariate shift via weighted conformal p-values

This paper introduces weighted conformal p-values for model-free selecti...
research
06/14/2019

A/B Testing Measurement Framework for Recommendation Models Based on Expected Revenue

We provide a method to determine whether a new recommendation system imp...