Nonparametric Detection of Anomalous Data Streams

04/25/2014
by   Shaofeng Zou, et al.

A nonparametric anomalous hypothesis testing problem is investigated, in which there are n sequences in total, s of which are anomalous and are to be detected. Each typical sequence contains m independent and identically distributed (i.i.d.) samples drawn from a distribution p, whereas each anomalous sequence contains m i.i.d. samples drawn from a distribution q that is distinct from p. The distributions p and q are assumed to be unknown in advance. Distribution-free tests are constructed using maximum mean discrepancy (MMD) as the metric, which is based on mean embeddings of distributions into a reproducing kernel Hilbert space. The probability of error is bounded as a function of the sample size m, the number s of anomalous sequences, and the number n of sequences. It is then shown that with s known, the constructed test is exponentially consistent for any p and q if m is greater than a constant multiple of log n, whereas with s unknown, m should have an order strictly greater than log n. Furthermore, it is shown that no test can be consistent for arbitrary p and q if m is less than a constant multiple of log n, which establishes the order-level optimality of the proposed test. Numerical results are provided to demonstrate that the proposed tests outperform (or perform as well as) tests based on other competitive approaches in various cases.
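To make the known-s setting concrete, the following is a minimal sketch of an MMD-based detector and is not necessarily the authors' exact test. It assumes scalar samples, a Gaussian kernel with a hand-picked bandwidth, and a scoring rule that compares each sequence against the pooled samples of all other sequences, then reports the s sequences with the largest unbiased MMD² estimates. The function names (`gaussian_kernel`, `mmd2_unbiased`, `detect_anomalous`) and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gram matrix of a Gaussian kernel between 1-D sample arrays x and y."""
    sq_dist = (x[:, None] - y[None, :]) ** 2
    return np.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd2_unbiased(x, y, bandwidth=1.0):
    """Unbiased estimate of the squared MMD between samples x and y."""
    a, b = len(x), len(y)
    kxx = gaussian_kernel(x, x, bandwidth)
    kyy = gaussian_kernel(y, y, bandwidth)
    kxy = gaussian_kernel(x, y, bandwidth)
    # Within-sample terms exclude the diagonal (i == j) entries.
    term_xx = (kxx.sum() - np.trace(kxx)) / (a * (a - 1))
    term_yy = (kyy.sum() - np.trace(kyy)) / (b * (b - 1))
    return term_xx + term_yy - 2.0 * kxy.mean()


def detect_anomalous(sequences, s, bandwidth=1.0):
    """Return the indices of the s highest-scoring sequences and all scores.

    sequences: array of shape (n, m), one row per sequence.
    Assumption: each sequence is scored against the pooled samples of the
    other n - 1 sequences, and s is known in advance.
    """
    n = sequences.shape[0]
    scores = np.empty(n)
    for k in range(n):
        rest = np.delete(sequences, k, axis=0).ravel()
        scores[k] = mmd2_unbiased(sequences[k], rest, bandwidth)
    top = np.argsort(scores)[-s:]
    return sorted(top.tolist()), scores


# Synthetic demo: p = N(0, 1), q = N(3, 1); both are illustrative choices.
rng = np.random.default_rng(0)
n, m, s = 10, 100, 2
data = rng.normal(0.0, 1.0, size=(n, m))
anomalous = [2, 7]
data[anomalous] = rng.normal(3.0, 1.0, size=(len(anomalous), m))
detected, scores = detect_anomalous(data, s)
print("detected:", detected)
```

When s is unknown, a sketch along the same lines would replace the top-s selection with a threshold on the scores; the choice of that threshold is what drives the stricter requirement on m stated in the abstract.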

research
04/05/2016

Nonparametric Detection of Geometric Structures over Networks

Nonparametric detection of existence of an anomalous structure over a ne...
research
01/21/2017

Linear-Complexity Exponentially-Consistent Tests for Universal Outlying Sequence Detection

The problem of universal outlying sequence detection is studied, where t...
research
04/01/2014

A Kernel-Based Nonparametric Test for Anomaly Detection over Line Networks

The nonparametric problem of detecting existence of an anomalous interva...
research
07/31/2018

K-medoids Clustering of Data Sequences with Composite Distributions

This paper studies clustering of data sequences using the k-medoids algo...
research
09/08/2020

Second-Order Asymptotically Optimal Universal Outlying Sequence Detection with Reject Option

Motivated by practical machine learning applications, we revisit the out...
research
04/09/2017

Strictly Proper Kernel Scoring Rules and Divergences with an Application to Kernel Two-Sample Hypothesis Testing

We study strictly proper scoring rules in the Reproducing Kernel Hilbert...
research
06/30/2022

Joint Sequential Detection and Isolation for Dependent Data Streams

The problem of joint sequential detection and isolation is considered in...
