Comparing a Large Number of Multivariate Distributions

04/11/2019
by Ilmun Kim, et al.

In this paper, we propose a test for the equality of multiple distributions based on kernel mean embeddings. By virtue of kernel methods, our framework handles multivariate and even high-dimensional data flexibly, and it allows the number of distributions to increase with the sample size. This is in contrast to previous studies, which have mostly been restricted to classical low-dimensional settings with a fixed number of distributions. Building on a Cramér-type moderate deviation result for degenerate two-sample V-statistics, we derive the limiting null distribution of the test statistic and show that it converges to a Gumbel distribution. The limiting distribution, however, depends on an infinite number of nuisance parameters, which makes it infeasible to use in practice. To address this issue, the proposed test is implemented via a permutation procedure and is shown to be minimax rate optimal against sparse alternatives. In the course of our analysis, we develop an exponential concentration inequality for the permuted test statistic, which may be of independent interest.
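
To make the idea of a permutation-calibrated kernel test concrete, the following is a minimal sketch and not the authors' implementation. It assumes a Gaussian kernel with a fixed bandwidth and takes the test statistic to be the maximum over all pairs of groups of the squared V-statistic MMD. That choice is an assumption motivated by the Gumbel limit mentioned in the abstract, since the abstract does not spell out the statistic. The function names (`gaussian_gram`, `max_pairwise_mmd`, `permutation_test`) and default parameters are hypothetical.

```python
import numpy as np


def gaussian_gram(z, bandwidth):
    """Gaussian-kernel Gram matrix of the pooled sample z, shape (n, d)."""
    sq = np.sum(z ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * z @ z.T
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * bandwidth ** 2))


def max_pairwise_mmd(gram, labels, k):
    """Maximum squared V-statistic MMD over all pairs of the k groups.

    Assumed statistic for illustration; the abstract does not define it.
    """
    onehot = np.eye(k)[labels]               # (n, k) group indicators
    weights = onehot / onehot.sum(axis=0)    # column g holds 1/n_g for members of g
    m = weights.T @ gram @ weights           # m[a, b]: mean kernel value between groups a, b
    diag = np.diag(m)
    mmd2 = diag[:, None] + diag[None, :] - 2.0 * m
    return mmd2.max()


def permutation_test(samples, bandwidth=1.0, n_perm=200, seed=0):
    """Return (observed statistic, permutation p-value) for a list of samples."""
    rng = np.random.default_rng(seed)
    k = len(samples)
    z = np.vstack(samples)
    labels = np.repeat(np.arange(k), [len(s) for s in samples])
    gram = gaussian_gram(z, bandwidth)       # computed once, reused for every permutation
    observed = max_pairwise_mmd(gram, labels, k)
    perm_stats = np.array(
        [max_pairwise_mmd(gram, rng.permutation(labels), k) for _ in range(n_perm)]
    )
    # Adding one to numerator and denominator keeps the p-value valid and strictly positive.
    p_value = (1 + np.sum(perm_stats >= observed)) / (1 + n_perm)
    return observed, p_value
```

Calling `permutation_test([x1, x2, x3])` with arrays of shape `(n_i, d)` returns the observed statistic and a p-value; the null hypothesis of equal distributions would be rejected when the p-value falls below the chosen level. Permuting group labels rather than recomputing kernels keeps each permutation at the cost of one matrix product.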

research
11/27/2022

A Permutation-free Kernel Two-Sample Test

The kernel Maximum Mean Discrepancy (MMD) is a popular multivariate dist...
research
03/13/2020

Two-Sample High Dimensional Mean Test Based On Prepivots

Testing equality of mean vectors is a very commonly used criterion when ...
research
03/20/2023

Dimension-agnostic Change Point Detection

Change point testing is a well-studied problem in statistics. Owing to t...
research
09/22/2019

Distribution-free consistent independence tests via Hallin's multivariate rank

This paper investigates the problem of testing independence of two rando...
research
12/27/2022

Two-sample Behrens–Fisher problems for high-dimensional data: a normal reference F-type test

The problem of testing the equality of mean vectors for high-dimensional...
research
09/14/2017

Two-sample Statistics Based on Anisotropic Kernels

The paper introduces a new kernel-based Maximum Mean Discrepancy (MMD) s...
research
09/07/2020

Anomaly Detection in Stationary Settings: A Permutation-Based Higher Criticism Approach

Anomaly detection when observing a large number of data streams is essen...