Generalized Kernel Two-Sample Tests

11/12/2020
by Hoseung Song, et al.

Kernel two-sample tests are widely used to test whether two multivariate samples come from the same distribution. However, existing tests based on mapping distributions into a reproducing kernel Hilbert space (RKHS) lose power against some common alternatives when the dimension of the data is moderate to high, a consequence of the curse of dimensionality. We propose a new test statistic that exploits an informative pattern arising in moderate and high dimensions and achieves substantial power improvements over existing kernel two-sample tests for a wide range of alternatives. We also propose alternative testing procedures that maintain high power at low computational cost, offering easy off-the-shelf tools for large datasets. We illustrate these new approaches through an analysis of the New York City taxi data.
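
For orientation, the kind of existing RKHS-based test the abstract refers to can be sketched as a maximum mean discrepancy (MMD) permutation test. This is only a minimal illustration of that standard baseline, not the new statistic proposed in the paper. The function names, the Gaussian kernel, the fixed bandwidth, and the number of permutations are all assumptions chosen for the example.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth):
    """Gaussian (RBF) kernel between two points given as sequences of floats."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd2_unbiased(xs, ys, bandwidth):
    """Unbiased estimate of squared MMD between samples xs and ys."""
    m, n = len(xs), len(ys)
    # Within-sample sums exclude the diagonal (i == j) terms.
    k_xx = sum(gaussian_kernel(xs[i], xs[j], bandwidth)
               for i in range(m) for j in range(m) if i != j)
    k_yy = sum(gaussian_kernel(ys[i], ys[j], bandwidth)
               for i in range(n) for j in range(n) if i != j)
    # The cross-sample sum uses every pair.
    k_xy = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return k_xx / (m * (m - 1)) + k_yy / (n * (n - 1)) - 2.0 * k_xy / (m * n)


def permutation_pvalue(xs, ys, bandwidth=1.0, n_perm=200, seed=0):
    """Permutation p-value for H0: both samples share one distribution.

    bandwidth, n_perm and seed are illustrative defaults (assumptions).
    """
    rng = random.Random(seed)
    observed = mmd2_unbiased(xs, ys, bandwidth)
    pooled = list(xs) + list(ys)
    m = len(xs)
    exceed = 0
    for _ in range(n_perm):
        # Reassign group labels at random and recompute the statistic.
        rng.shuffle(pooled)
        if mmd2_unbiased(pooled[:m], pooled[m:], bandwidth) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return (exceed + 1) / (n_perm + 1)
```

Calling `permutation_pvalue(xs, ys)` on two lists of equal-length coordinate tuples returns a p-value in (0, 1]. The quadratic cost of each statistic evaluation, repeated over every permutation, is one reason cheaper testing procedures are attractive for large datasets.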

10/07/2021

A Fast and Effective Large-Scale Two-Sample Test Based on Kernels

Kernel two-sample tests have been widely used and the development of eff...
02/10/2021

An Optimal Witness Function for Two-Sample Testing

We propose data-dependent test statistics based on a one-dimensional wit...
03/02/2018

Robust Multivariate Nonparametric Tests via Projection-Pursuit

In this work, we generalize the Cramér-von Mises statistic via projectio...
09/24/2017

On the Optimality of Kernel-Embedding Based Goodness-of-Fit Tests

The reproducing kernel Hilbert space (RKHS) embedding of distributions o...
06/15/2015

Fast Two-Sample Testing with Analytic Representations of Probability Measures

We propose a class of nonparametric two-sample tests with a cost linear ...
04/10/2019

A Reproducing Kernel Hilbert Space log-rank test for the two-sample problem

Weighted log-rank tests are arguably the most widely used tests by pract...
07/02/2021

Generalized Multivariate Signs for Nonparametric Hypothesis Testing in High Dimensions

High-dimensional data, where the dimension of the feature space is much ...