Feature Shift Detection: Localizing Which Features Have Shifted via Conditional Distribution Tests

07/14/2021
by   Sean Kulinski, et al.

While previous distribution shift detection approaches can identify whether a shift has occurred, they cannot localize which specific features caused it, a critical step in diagnosing or fixing any underlying issue. For example, in military sensor networks, users will want to detect when one or more sensors have been compromised and, critically, to know which specific sensors might be compromised. Thus, we first formalize this problem as multiple conditional distribution hypothesis tests and propose both non-parametric and parametric statistical tests. For both efficiency and flexibility, we then propose a test statistic based on the density model score function (i.e., the gradient of the log density with respect to the input), which yields test statistics for all dimensions in a single forward and backward pass. Any density model can be used to compute the necessary statistics, including deep density models such as normalizing flows or autoregressive models. We additionally develop methods for identifying when and where a shift occurs in multivariate time-series data and show results for multiple scenarios using realistic attack models on both simulated and real-world data.
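To make the score-function idea concrete, here is a minimal sketch in Python with NumPy. It is an illustration under stated assumptions, not the paper's implementation. The density model is a plain multivariate Gaussian, whose score is available in closed form, so no automatic differentiation is needed. The per-feature statistic (mean squared difference between the scores of the two fitted models) is one plausible choice. The function names, the simulated "compromised sensor" data, and the rule of flagging the largest statistic are all hypothetical.

```python
import numpy as np

def fit_gaussian(x):
    """Fit a multivariate Gaussian; return its mean and precision matrix."""
    mean = x.mean(axis=0)
    # A small ridge keeps the covariance invertible (an assumption of this sketch).
    cov = np.cov(x, rowvar=False) + 1e-6 * np.eye(x.shape[1])
    return mean, np.linalg.inv(cov)

def gaussian_score(x, mean, precision):
    """Score function: gradient of the log density with respect to the input.
    For a Gaussian this is -precision @ (x - mean), computed for every row of x."""
    return -(x - mean) @ precision

def score_shift_statistics(x_ref, x_new):
    """Per-feature mean squared difference between the scores of two fitted models.
    One score evaluation per model gives statistics for all features at once."""
    mean_p, prec_p = fit_gaussian(x_ref)
    mean_q, prec_q = fit_gaussian(x_new)
    pooled = np.vstack([x_ref, x_new])
    diff = gaussian_score(pooled, mean_q, prec_q) - gaussian_score(pooled, mean_p, prec_p)
    return (diff ** 2).mean(axis=0)

# Hypothetical simulation: four correlated "sensors"; sensor 2 is replaced by
# independent noise with the same marginal distribution in the new sample.
rng = np.random.default_rng(0)
cov = 0.5 * np.eye(4) + 0.5 * np.ones((4, 4))
x_ref = rng.multivariate_normal(np.zeros(4), cov, size=2000)
x_new = rng.multivariate_normal(np.zeros(4), cov, size=2000)
x_new[:, 2] = rng.standard_normal(2000)

stats = score_shift_statistics(x_ref, x_new)
suspect = int(np.argmax(stats))  # simple heuristic: flag the largest statistic
print("per-feature statistics:", np.round(stats, 3))
print("most suspicious feature:", suspect)
```

In this simulation the unshifted features also receive nonzero statistics, because their conditional distributions depend on the compromised feature. Flagging the largest value is therefore only a heuristic stand-in for a proper hypothesis test with a calibrated threshold.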

research
09/28/2022

Using the Sinkhorn divergence in permutation tests for the multivariate two-sample problem

In order to adapt the Wasserstein distance to the large sample multivari...
research
07/14/2023

Two-Sample Test with Copula Entropy

In this paper we propose a two-sample test based on copula entropy (CE)....
research
02/24/2020

Testing Goodness of Fit of Conditional Density Models with Kernels

We propose two nonparametric statistical tests of goodness of fit for co...
research
01/26/2018

Correlated Components Analysis --- Extracting Reliable Dimensions in Multivariate Data

How does one find data dimensions that are reliably expressed across rep...
research
08/07/2023

DOMINO: Domain-invariant Hyperdimensional Classification for Multi-Sensor Time Series Data

With the rapid evolution of the Internet of Things, many real-world appl...
research
03/02/2022

Model-agnostic out-of-distribution detection using combined statistical tests

We present simple methods for out-of-distribution detection using a trai...
research
09/04/2017

Learning Implicit Generative Models Using Differentiable Graph Tests

Recently, there has been a growing interest in the problem of learning r...
