New graph-based multi-sample tests for high-dimensional and non-Euclidean data

05/27/2022
by   Hoseung Song, et al.

Testing whether multiple samples come from the same distribution is a common task in many fields. However, this problem has not been well explored for high-dimensional or non-Euclidean data. In this paper, we propose new nonparametric tests based on a similarity graph constructed on the pooled observations from multiple samples. The tests make use of both within-sample edges and between-sample edges, a straightforward yet previously unexplored idea. The new tests exhibit substantial power improvements over existing tests for a wide range of alternatives. We also study the asymptotic distributions of the test statistics, offering easy off-the-shelf tools for large datasets. The new tests are illustrated through an analysis of the age image dataset.
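To make the graph-based idea concrete, the following is a minimal sketch, not the test statistic proposed in the paper (the abstract does not give it). It assumes a k-nearest-neighbor graph as the similarity graph, Euclidean distance as the dissimilarity, and the total number of within-sample edges as a stand-in statistic calibrated by permutation. The function names `knn_edges`, `edge_count_matrix`, and `permutation_pvalue` are hypothetical and chosen only for illustration.

```python
import numpy as np

def knn_edges(X, k=5):
    """Undirected k-nearest-neighbor edges (i, j), i < j, on pooled data X.

    Assumption: Euclidean distance; any dissimilarity could be substituted.
    """
    sq = np.sum(X ** 2, axis=1)
    d = sq[:, None] + sq[None, :] - 2.0 * X @ X.T  # squared distances
    np.fill_diagonal(d, np.inf)                    # exclude self-neighbors
    nbrs = np.argsort(d, axis=1)[:, :k]
    edges = {(min(i, int(j)), max(i, int(j)))
             for i in range(len(X)) for j in nbrs[i]}
    return np.array(sorted(edges))

def edge_count_matrix(edges, labels, n_samples):
    """C[a, b] = number of edges joining sample a and sample b.

    Diagonal entries count within-sample edges; off-diagonal entries
    count between-sample edges (stored symmetrically).
    """
    C = np.zeros((n_samples, n_samples), dtype=int)
    for i, j in edges:
        a, b = labels[i], labels[j]
        C[a, b] += 1
        if a != b:
            C[b, a] += 1
    return C

def permutation_pvalue(edges, labels, n_samples, n_perm=500, seed=0):
    """Permutation p-value for the stand-in statistic (within-sample edges).

    Large values suggest that observations cluster by sample.
    """
    rng = np.random.default_rng(seed)

    def stat(lab):
        return np.trace(edge_count_matrix(edges, lab, n_samples))

    observed = stat(labels)
    perms = [stat(rng.permutation(labels)) for _ in range(n_perm)]
    return (1 + sum(p >= observed for p in perms)) / (1 + n_perm)

# Usage: three synthetic samples in 20 dimensions with shifted means.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(loc=m, size=(30, 20)) for m in (0.0, 3.0, 6.0)])
labels = np.repeat(np.arange(3), 30)
edges = knn_edges(X, k=5)
C = edge_count_matrix(edges, labels, 3)
p_value = permutation_pvalue(edges, labels, 3)
```

The graph is built once on the pooled observations and only the sample labels are permuted, so the same edge set serves every permutation. A statistic that also weights the off-diagonal (between-sample) counts of `C`, as the abstract describes, would replace the trace inside `stat`.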


research
11/12/2017

Graph-Based Two-Sample Tests for Discrete Data

In the regime of two-sample comparison, tests based on a graph construct...
research
11/12/2020

Generalized Kernel Two-Sample Tests

Kernel two-sample tests have been widely used for multivariate data in t...
research
07/03/2020

New Non-parametric Tests for Multivariate Paired Data

Paired data are common in many fields, such as medical diagnosis and lon...
research
06/26/2021

Hypothesis Testing for Two Sample Comparison of Network Data

Network data is a major object data type that has been widely collected ...
research
06/03/2022

New kernel-based change-point detection

Change-point analysis plays a significant role in various fields to reve...
research
12/24/2021

RISE: Rank in Similarity Graph Edge-Count Two-Sample Test

Two-sample hypothesis testing for high-dimensional data is ubiquitous no...
research
10/03/2018

Nonparametric High-dimensional K-sample Comparison

High-dimensional k-sample comparison is a common applied problem. We con...
