Exact and efficient multivariate two-sample tests through generalized linear rank statistics

09/28/2022
by Dan D. Erdmann-Pham, et al.

So-called linear rank statistics provide a means for distribution-free (even in finite samples), yet highly flexible, two-sample testing in the setting of univariate random variables. Their flexibility derives from a choice of weights that can be adapted to any given (simple) alternative hypothesis to achieve efficiency in case of correct specification of said alternative, while their non-parametric nature guarantees well-calibrated p-values even under misspecification. By drawing connections to (generalized) maximum likelihood estimation, and exploiting recent work on ranks in multiple dimensions, we extend linear rank statistics both to multivariate random variables and composite alternatives. Doing so yields non-parametric, multivariate two-sample tests that mirror efficiency properties of likelihood ratio tests, while remaining robust against model misspecification. We prove non-parametric versions of the classical Wald and score tests facilitating hypothesis testing in the asymptotic regime, and relate these generalized linear rank statistics to linear spacing statistics enabling exact p-value computations in the small to moderate sample setting. Moreover, viewing rank statistics through the lens of likelihood ratios affords applications beyond fully efficient two-sample testing, of which we demonstrate three: testing in the presence of nuisance alternatives, simultaneous detection of location and scale shifts, and K-sample testing.
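
As a rough illustration of the univariate starting point described above (not the paper's generalized or multivariate construction), a linear rank statistic can be sketched as a sum of scores evaluated at the pooled-sample ranks of one sample. The function names, the choice of the identity score (which corresponds to a Wilcoxon-type rank-sum), the normalization by N + 1, and the Monte Carlo approximation of the null distribution are all assumptions made for this sketch. It also assumes continuous data with no ties.

```python
import numpy as np


def linear_rank_statistic(x, y, score=lambda u: u):
    """Sum of score(R_i / (N + 1)) over the pooled-sample ranks R_i of x.

    `score` must accept a NumPy array; the identity gives a Wilcoxon-type statistic.
    Assumes no ties in the pooled sample.
    """
    pooled = np.concatenate([x, y])
    n_total = pooled.size
    ranks = np.empty(n_total, dtype=float)
    # Position k in sorted order receives rank k + 1.
    ranks[np.argsort(pooled, kind="mergesort")] = np.arange(1, n_total + 1)
    return float(np.sum(score(ranks[: len(x)] / (n_total + 1))))


def permutation_p_value(x, y, score=lambda u: u, n_perm=2000, seed=0):
    """Two-sided Monte Carlo p-value for the statistic above.

    Under the null, the ranks of x form a uniformly random subset of
    {1, ..., N}, so the null distribution depends only on the scores and
    not on the data distribution (the distribution-free property).
    """
    rng = np.random.default_rng(seed)
    m = len(x)
    n_total = m + len(y)
    scores = score(np.arange(1, n_total + 1) / (n_total + 1))
    center = m * scores.mean()  # null expectation of the statistic
    observed = abs(linear_rank_statistic(x, y, score) - center)
    hits = 0
    for _ in range(n_perm):
        subset = rng.choice(n_total, size=m, replace=False)
        if abs(scores[subset].sum() - center) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)
```

Swapping in a different `score` function (for example, one that emphasizes extreme ranks) changes which alternatives the test is most sensitive to, while the null calibration stays the same because it is computed from the scores alone.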


