Interpoint Distance Based Two Sample Tests in High Dimension

02/19/2019
by Changbo Zhu, et al.

In this paper, we study a class of two-sample test statistics based on inter-point distances in the high-dimensional, low-sample-size setting. Our test statistics include the well-known energy distance and the maximum mean discrepancy with Gaussian and Laplacian kernels, and the critical values are obtained via permutations. We show that all these tests are inconsistent when the two high-dimensional distributions share the same marginal distributions but differ in other aspects. The tests based on energy distance and maximum mean discrepancy mainly target differences between marginal means and variances, whereas the test based on the L^1-distance can capture differences in the marginal distributions. Our theory sheds new light on the limitations of inter-point distance based tests, the impact of different distance metrics, and the behavior of permutation tests in high dimension. Simulation results and a real data illustration are also presented to corroborate our theoretical findings.
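To make the setup concrete, the following is a minimal sketch of a permutation two-sample test built on an inter-point distance statistic. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact normalization of the statistic, so the sketch uses the common U-statistic form of the sample energy distance, and the function names (l1_distance, l2_distance, energy_statistic, permutation_pvalue) and the default of 200 permutations are hypothetical choices made for this example.

```python
import math
import random


def l2_distance(x, y):
    """Euclidean distance between two points given as equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def l1_distance(x, y):
    """L^1 (Manhattan) distance between two points."""
    return sum(abs(a - b) for a, b in zip(x, y))


def mean_within(points, dist):
    """Average distance over all ordered pairs of distinct points."""
    n = len(points)
    total = sum(
        dist(points[i], points[j])
        for i in range(n)
        for j in range(n)
        if i != j
    )
    return total / (n * (n - 1))


def energy_statistic(xs, ys, dist=l2_distance):
    """Energy-distance-type statistic (assumed U-statistic form):
    2 * mean d(X, Y) - mean d(X, X') - mean d(Y, Y')."""
    cross = sum(dist(x, y) for x in xs for y in ys) / (len(xs) * len(ys))
    return 2.0 * cross - mean_within(xs, dist) - mean_within(ys, dist)


def permutation_pvalue(xs, ys, n_perm=200, seed=0, dist=l2_distance):
    """Permutation p-value: reshuffle the pooled sample, recompute the
    statistic, and count how often it is at least as large as observed."""
    rng = random.Random(seed)
    observed = energy_statistic(xs, ys, dist)
    pooled = list(xs) + list(ys)
    n = len(xs)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if energy_statistic(pooled[:n], pooled[n:], dist) >= observed:
            exceed += 1
    # Add one to numerator and denominator so the p-value is never zero.
    return (exceed + 1) / (n_perm + 1)
```

Passing dist=l1_distance swaps the Euclidean distance for the L^1-distance, which is the kind of change in distance metric the abstract contrasts. Each sample needs at least two points, since the within-sample averages are taken over pairs of distinct observations.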


