On Wasserstein Two Sample Testing and Related Families of Nonparametric Tests

09/08/2015
by   Aaditya Ramdas, et al.

Nonparametric two sample or homogeneity testing is a decision theoretic problem that involves identifying differences between two random variables without making parametric assumptions about their underlying distributions. The literature is old and rich, with a wide variety of statistics having been intelligently designed and analyzed, both for the unidimensional and the multivariate setting. Our contribution is to tie together many of these tests, drawing connections between seemingly very different statistics. In this work, our central object is the Wasserstein distance, as we form a chain of connections from univariate methods like the Kolmogorov-Smirnov test, PP/QQ plots and ROC/ODC curves, to multivariate tests involving energy statistics and kernel-based maximum mean discrepancy. Some connections proceed through the construction of a smoothed Wasserstein distance, and others through the pursuit of a "distribution-free" Wasserstein test. Some observations in this chain are implicit in the literature, while others seem not to have been noticed thus far. Given the classical and continued importance of nonparametric two sample testing, we aim to provide useful connections for theorists and practitioners familiar with one subset of methods but not others.
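To make the univariate end of this chain concrete, the sketch below computes two of the statistics named in the abstract on one-dimensional samples: an empirical Wasserstein distance (a comparison of sorted samples, i.e. of empirical quantile functions) and the Kolmogorov-Smirnov statistic (a comparison of empirical CDFs). It is an illustration only, not the paper's procedure: the function names, the equal-sample-size restriction, and the permutation calibration are assumptions chosen for brevity.

```python
import numpy as np


def wasserstein_1d(x, y, p=1):
    """Empirical p-Wasserstein distance between two equal-size 1-D samples.

    For equal sample sizes, the optimal coupling in one dimension matches
    order statistics, so the distance reduces to comparing sorted samples.
    """
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.shape != y.shape:
        raise ValueError("this sketch assumes equal sample sizes")
    return float(np.mean(np.abs(x - y) ** p) ** (1.0 / p))


def ks_statistic(x, y):
    """Two-sample Kolmogorov-Smirnov statistic: sup |F_x - F_y|."""
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    grid = np.concatenate([x, y])  # the supremum is attained at a sample point
    cdf_x = np.searchsorted(x, grid, side="right") / len(x)
    cdf_y = np.searchsorted(y, grid, side="right") / len(y)
    return float(np.max(np.abs(cdf_x - cdf_y)))


def permutation_pvalue(x, y, statistic, n_perm=500, seed=0):
    """Permutation p-value for any two-sample statistic (assumed calibration)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    pooled = np.concatenate([x, y])
    n = len(x)
    observed = statistic(x, y)
    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        if statistic(perm[:n], perm[n:]) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    a = rng.normal(0.0, 1.0, size=200)
    b = rng.normal(0.5, 1.0, size=200)  # shifted alternative
    print("W1:", wasserstein_1d(a, b), "p:", permutation_pvalue(a, b, wasserstein_1d))
    print("KS:", ks_statistic(a, b), "p:", permutation_pvalue(a, b, ks_statistic))
```

Both statistics are functions of the same two empirical distributions; one measures the gap between quantile functions and the other the gap between CDFs, which is the kind of relationship the abstract refers to.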

research
08/04/2015

Adaptivity and Computation-Statistics Tradeoffs for Kernel and Distance based High Dimensional Two Sample Testing

Nonparametric two sample testing is a decision theoretic problem that in...
research
09/28/2022

Exact and efficient multivariate two-sample tests through generalized linear rank statistics

So-called linear rank statistics provide a means for distribution-free (...
research
12/21/2018

Global and Local Two-Sample Tests via Regression

Two-sample testing is a fundamental problem in statistics. Despite its l...
research
03/14/2020

Multivariate goodness-of-Fit tests based on Wasserstein distance

Goodness-of-fit tests based on the empirical Wasserstein distance are pr...
research
02/19/2019

Interpoint Distance Based Two Sample Tests in High Dimension

In this paper, we study a class of two sample test statistics based on i...
research
11/29/2020

A new approach to posterior contraction rates via Wasserstein dynamics

This paper presents a new approach to the classical problem of quantifyi...
research
03/22/2017

Testing and Learning on Distributions with Symmetric Noise Invariance

Kernel embeddings of distributions and the Maximum Mean Discrepancy (MMD...
