
Adaptivity and Computation-Statistics Tradeoffs for Kernel and Distance Based High Dimensional Two Sample Testing
Nonparametric two sample testing is a decision theoretic problem that in...

A Linear-Time Kernel Goodness-of-Fit Test
We propose a novel adaptive test of goodness-of-fit, with computational ...

Testing Equivalence of Clustering
In this paper, we test whether two datasets share a common clustering st...

Testing and Learning on Distributions with Symmetric Noise Invariance
Kernel embeddings of distributions and the Maximum Mean Discrepancy (MMD...

Intermediate Efficiency of Tests under Heavy-tailed Alternatives
We show that for local alternatives which are not square integrable the ...

Directing Power Towards Sub-Alternatives
This paper proposes a novel test statistic for testing a potentially hig...

Classification Logit Two-sample Testing by Neural Networks
The recent success of generative adversarial networks and variational le...
On the High-dimensional Power of Linear-time Kernel Two-Sample Testing under Mean-difference Alternatives
Nonparametric two sample testing deals with the question of consistently deciding if two distributions are different, given samples from both, without making any parametric assumptions about the form of the distributions. The current literature is split into two kinds of tests: those which are consistent without any assumptions about how the distributions may differ (general alternatives), and those which are designed to specifically test easier alternatives, like a difference in means (mean-shift alternatives). The main contribution of this paper is to explicitly characterize the power of a popular nonparametric two sample test, designed for general alternatives, under a mean-shift alternative in the high-dimensional setting. Specifically, we explicitly derive the power of the linear-time Maximum Mean Discrepancy statistic using the Gaussian kernel, where the dimension and sample size can both tend to infinity at any rate, and the two distributions differ in their means. As a corollary, we find that if the signal-to-noise ratio is held constant, then the test's power goes to one if the number of samples increases faster than the dimension increases. This is the first explicit power derivation for a general nonparametric test in the high-dimensional setting, and also the first analysis of how tests designed for general alternatives perform when faced with easier ones.
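The linear-time MMD statistic discussed in the abstract achieves O(n) cost by pairing up consecutive samples so that each kernel evaluation is used only once, rather than computing all O(n^2) pairwise kernel values. The sketch below is a minimal illustration of that idea; the fixed bandwidth, the naive consecutive-pairing scheme, and the function names are assumptions for illustration, not the paper's exact construction.

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth=1.0):
    # k(a, b) = exp(-||a - b||^2 / (2 * bandwidth^2)), computed row-wise
    return np.exp(-np.sum((a - b) ** 2, axis=-1) / (2.0 * bandwidth ** 2))

def linear_time_mmd(X, Y, bandwidth=1.0):
    """Linear-time estimate of MMD^2 between samples X and Y.

    Consecutive samples are grouped into disjoint pairs, and the usual
    MMD core term
        h = k(x1, x2) + k(y1, y2) - k(x1, y2) - k(x2, y1)
    is averaged over the pairs, so each sample enters exactly one term.
    """
    n = (min(len(X), len(Y)) // 2) * 2  # use an even number of samples
    X, Y = X[:n], Y[:n]
    x1, x2 = X[0::2], X[1::2]
    y1, y2 = Y[0::2], Y[1::2]
    h = (gaussian_kernel(x1, x2, bandwidth)
         + gaussian_kernel(y1, y2, bandwidth)
         - gaussian_kernel(x1, y2, bandwidth)
         - gaussian_kernel(x2, y1, bandwidth))
    return h.mean()
```

Under a mean-shift alternative, the within-sample kernel terms stay large while the cross terms shrink, so the statistic concentrates away from zero; when the two distributions coincide, its expectation is zero.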