RISE: Rank in Similarity Graph Edge-Count Two-Sample Test

12/24/2021
by Doudou Zhou, et al.

Two-sample hypothesis testing for high-dimensional data is ubiquitous nowadays. Rank-based tests are popular nonparametric methods for univariate data. However, they are difficult to extend to high-dimensional data. In this paper, we propose a new family of nonparametric two-sample testing procedures, the Rank In Similarity graph Edge-count two-sample test (RISE). The new test statistic is constructed on a rank-weighted similarity graph, such as the k-nearest neighbor graph. As a result, RISE can also be applied to non-Euclidean data. Theoretically, we prove that, under some mild conditions, the new test statistic converges to a chi-squared distribution under the permutation null distribution, enabling fast type-I error control. RISE exhibits good power under a wide range of alternatives compared to existing methods, as shown in extensive simulations. The new test is illustrated on the New York City taxi data, to compare travel patterns in consecutive months, and on a brain network dataset, to compare male and female subjects.
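
The general idea of an edge-count test on a rank-weighted k-nearest neighbor graph can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the RISE statistic as defined in the paper: the weighting scheme (the r-th nearest neighbor gets weight k - r + 1, then symmetrized), the function names, and the use of Monte Carlo permutations to estimate the null mean and covariance are all choices made here for illustration. The paper instead relies on the chi-squared limit of its statistic, which avoids resampling.

```python
import numpy as np

def knn_rank_weights(Z, k):
    """Symmetric rank-weighted k-NN graph (illustrative weighting, an assumption)."""
    n = Z.shape[0]
    diff = Z[:, None, :] - Z[None, :, :]
    D = np.einsum("ijk,ijk->ij", diff, diff)   # squared Euclidean distances
    np.fill_diagonal(D, np.inf)                # a point is not its own neighbor
    nbrs = np.argsort(D, axis=1)[:, :k]        # indices of the k nearest neighbors
    R = np.zeros((n, n))
    rows = np.repeat(np.arange(n), k)
    # Nearest neighbor gets weight k, the k-th nearest gets weight 1.
    R[rows, nbrs.ravel()] = np.tile(np.arange(k, 0, -1), n)
    return 0.5 * (R + R.T)

def within_sample_weights(W, labels):
    """Total edge weight inside sample 0 and inside sample 1."""
    a = labels == 0
    b = ~a
    return np.array([W[np.ix_(a, a)].sum() / 2.0, W[np.ix_(b, b)].sum() / 2.0])

def edge_count_permutation_test(X, Y, k=5, n_perm=500, seed=0):
    """Quadratic-form statistic calibrated by label permutations (a sketch)."""
    rng = np.random.default_rng(seed)
    Z = np.vstack([X, Y])
    labels = np.r_[np.zeros(len(X), dtype=int), np.ones(len(Y), dtype=int)]
    W = knn_rank_weights(Z, k)
    obs = within_sample_weights(W, labels)
    perms = np.array([within_sample_weights(W, rng.permutation(labels))
                      for _ in range(n_perm)])
    mu = perms.mean(axis=0)                    # Monte Carlo null mean
    sigma_inv = np.linalg.inv(np.cov(perms, rowvar=False))

    def quad(u):
        return float((u - mu) @ sigma_inv @ (u - mu))

    t_obs = quad(obs)
    t_perm = np.array([quad(u) for u in perms])
    p_value = (1 + np.sum(t_perm >= t_obs)) / (1 + n_perm)
    return t_obs, p_value

rng = np.random.default_rng(1)
X = rng.normal(0.0, 1.0, size=(40, 5))
Y = rng.normal(3.0, 1.0, size=(40, 5))         # mean-shifted second sample
stat, p = edge_count_permutation_test(X, Y, k=5, n_perm=200)
print(f"statistic = {stat:.2f}, permutation p-value = {p:.4f}")
```

Because the sketch touches the data only through pairwise distances, the Euclidean distance could be swapped for any other dissimilarity, which is the sense in which graph-based tests extend to non-Euclidean data.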

research
07/23/2023

A Robust Framework for Graph-based Two-Sample Tests Using Weights

Graph-based tests are a class of non-parametric two-sample tests useful ...
research
05/27/2022

New graph-based multi-sample tests for high-dimensional and non-Euclidean data

Testing the equality in distributions of multiple samples is a common ta...
research
01/02/2021

Visual High Dimensional Hypothesis Testing

In exploratory data analysis of known classes of high dimensional data, ...
research
04/26/2023

Bootstrapped Edge Count Tests for Nonparametric Two-Sample Inference Under Heterogeneity

Nonparametric two-sample testing is a classical problem in inferential s...
research
03/05/2020

A Nearest-Neighbor Based Nonparametric Test for Viral Remodeling in Heterogeneous Single-Cell Proteomic Data

An important problem in contemporary immunology studies based on single-...
research
11/12/2017

Graph-Based Two-Sample Tests for Discrete Data

In the regime of two-sample comparison, tests based on a graph construct...
research
06/07/2022

RING-CPD: Asymptotic Distribution-free Change-point Detection for Multivariate and Non-Euclidean Data

Change-point detection (CPD) concerns detecting distributional changes i...
