
New graph-based multi-sample tests for high-dimensional and non-Euclidean data

by Hoseung Song, et al.
University of California-Davis
Fred Hutchinson Cancer Research Center

Testing the equality of distributions across multiple samples is a common task in many fields. However, this problem has not been well explored for high-dimensional or non-Euclidean data. In this paper, we propose new nonparametric tests based on a similarity graph constructed on the pooled observations from multiple samples. The tests make use of both within-sample edges and between-sample edges, a straightforward yet previously unexplored idea. The new tests exhibit substantial power improvements over existing tests for a wide range of alternatives. We also study the asymptotic distributions of the test statistics, offering easy off-the-shelf tools for large datasets. The new tests are illustrated through an analysis of the age image dataset.
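To make the edge-counting idea concrete, here is a minimal sketch in Python. It is not the test proposed in the paper. The abstract only says "similarity graph", so the k-nearest-neighbor graph, the Euclidean distance, the function names, and the simple statistic (total number of within-sample edges, calibrated by permutation) are all assumptions chosen for illustration. The paper's statistics combine within-sample and between-sample edge counts in their own way and come with asymptotic distributions, neither of which is reproduced here.

```python
import math
import random
from itertools import combinations_with_replacement


def knn_edges(points, k):
    """Undirected k-NN graph on the pooled points, as a set of (i, j) with i < j."""
    edges = set()
    for i, p in enumerate(points):
        # Rank all other points by Euclidean distance to point i (assumed metric).
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        for j in others[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges


def edge_counts(edges, labels, n_samples):
    """Count edges for every unordered pair of sample labels (a, b) with a <= b.

    Pairs with a == b are within-sample edges; pairs with a < b are
    between-sample edges.
    """
    counts = {
        pair: 0 for pair in combinations_with_replacement(range(n_samples), 2)
    }
    for i, j in edges:
        a, b = sorted((labels[i], labels[j]))
        counts[(a, b)] += 1
    return counts


def within_total(counts):
    """Illustrative statistic only: total number of within-sample edges."""
    return sum(c for (a, b), c in counts.items() if a == b)


def permutation_pvalue(edges, labels, n_samples, n_perm=500, seed=0):
    """Permutation p-value: shuffle labels on the fixed graph, recount edges."""
    rng = random.Random(seed)
    observed = within_total(edge_counts(edges, labels, n_samples))
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if within_total(edge_counts(edges, shuffled, n_samples)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Synthetic data: three samples in 5 dimensions, the third shifted in mean.
    rng = random.Random(1)
    points, labels = [], []
    for sample, shift in enumerate([0.0, 0.0, 3.0]):
        for _ in range(20):
            points.append([rng.gauss(shift, 1.0) for _ in range(5)])
            labels.append(sample)
    edges = knn_edges(points, k=5)
    print(edge_counts(edges, labels, n_samples=3))
    print(permutation_pvalue(edges, labels, n_samples=3))
```

Under the null hypothesis of equal distributions, sample labels are exchangeable, so relabeling the fixed graph gives a reference distribution for any edge-count statistic. An unusually large number of within-sample edges then suggests that at least one sample differs. Because the graph depends only on pairwise similarities, the same counting works for non-Euclidean data once `math.dist` is replaced by a suitable dissimilarity.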




Graph-Based Two-Sample Tests for Discrete Data

In the regime of two-sample comparison, tests based on a graph construct...

Generalized Kernel Two-Sample Tests

Kernel two-sample tests have been widely used for multivariate data in t...

New Non-parametric Tests for Multivariate Paired Data

Paired data are common in many fields, such as medical diagnosis and lon...

Hypothesis Testing for Two Sample Comparison of Network Data

Network data is a major object data type that has been widely collected ...

New kernel-based change-point detection

Change-point analysis plays a significant role in various fields to reve...

RISE: Rank in Similarity Graph Edge-Count Two-Sample Test

Two-sample hypothesis testing for high-dimensional data is ubiquitous no...

Nonparametric High-dimensional K-sample Comparison

High-dimensional k-sample comparison is a common applied problem. We con...