New graph-based multi-sample tests for high-dimensional and non-Euclidean data

05/27/2022
by Hoseung Song, et al.

Testing whether multiple samples come from the same distribution is a common task in many fields. However, this problem has not been well explored for high-dimensional or non-Euclidean data. In this paper, we propose new nonparametric tests based on a similarity graph constructed on the pooled observations from multiple samples. The tests make use of both within-sample edges and between-sample edges, a straightforward idea that has not yet been explored. The new tests exhibit substantial power improvements over existing tests for a wide range of alternatives. We also study the asymptotic distributions of the test statistics, offering easy off-the-shelf tools for large datasets. The new tests are illustrated through an analysis of the age image dataset.
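
To make the roles of within-sample and between-sample edges concrete, here is a minimal sketch of the raw ingredients only. It is not the paper's test statistic. The abstract does not say which similarity graph is used, so the k-nearest-neighbor graph, the Euclidean distance, and the function names `knn_edges` and `edge_counts` are all illustrative assumptions.

```python
import math
from collections import Counter


def knn_edges(points, k):
    """Undirected edge set of a k-nearest-neighbor graph (assumed graph type)."""
    edges = set()
    for i, p in enumerate(points):
        # Rank all other points by Euclidean distance to p; break ties by index.
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: (math.dist(p, points[j]), j),
        )
        for j in others[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges


def edge_counts(edges, labels):
    """Count edges by the unordered pair of sample labels they connect."""
    counts = Counter()
    for i, j in edges:
        a, b = sorted((labels[i], labels[j]))
        counts[(a, b)] += 1
    return counts


# Toy pooled data: three well-separated samples of three points each.
points = [
    (0, 0), (0, 1), (1, 0),        # sample 0
    (10, 10), (10, 11), (11, 10),  # sample 1
    (20, 0), (20, 1), (21, 0),     # sample 2
]
labels = [0, 0, 0, 1, 1, 1, 2, 2, 2]

counts = edge_counts(knn_edges(points, k=1), labels)
within = sum(c for (a, b), c in counts.items() if a == b)
between = sum(c for (a, b), c in counts.items() if a != b)
print(dict(counts), within, between)
```

Because the toy samples are far apart, every edge joins two points from the same sample. If the samples came from the same distribution, edges would mix across samples instead. A test of the kind described above would compare such counts with their behavior under the null hypothesis, a step this sketch leaves out.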
