Global and Local Two-Sample Tests via Regression

12/21/2018
by Ilmun Kim, et al.

Two-sample testing is a fundamental problem in statistics. Despite its long history, there has been renewed interest in this problem with the advent of high-dimensional and complex data. Specifically, in the machine learning literature, there have been recent methodological developments such as classification accuracy tests. The goal of this work is to present a regression approach to comparing multivariate distributions of complex data. Depending on the chosen regression model, our framework can efficiently handle different types of variables and various structures in the data, with competitive power under many practical scenarios. Whereas previous work has been largely limited to global tests which conceal much of the local information, our approach naturally leads to a local two-sample testing framework in which we identify local differences between multivariate distributions with statistical confidence. We demonstrate the efficacy of our approach both theoretically and empirically, under some well-known parametric and nonparametric regression methods. Our proposed methods are applied to simulated data as well as a challenging astronomy data set to assess their practical usefulness.
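
To make the regression idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes the two samples are pooled, labeled 0 and 1, and that a regression of the label on the features estimates m(x) = P(Y = 1 | X = x). If the two distributions are equal, m(x) is constant, so the sketch measures how far the fitted values stray from the overall label proportion and calibrates that statistic by permuting the labels. The function name, the plain least-squares fit, the squared-deviation statistic, and the permutation count are all illustrative assumptions; any other regression method could be swapped in.

```python
import numpy as np


def regression_two_sample_test(x0, x1, n_perm=200, seed=0):
    """Hypothetical global two-sample test built on a regression fit.

    x0, x1: arrays of shape (n0, d) and (n1, d) holding the two samples.
    Returns the observed statistic and a permutation p-value.
    """
    rng = np.random.default_rng(seed)

    # Pool the samples and attach a binary label for sample membership.
    x = np.vstack([x0, x1])
    y = np.concatenate([np.zeros(len(x0)), np.ones(len(x1))])

    # Design matrix with an intercept column (assumed linear regression model).
    design = np.column_stack([np.ones(len(x)), x])

    def statistic(labels):
        # Least-squares estimate of m(x) = P(Y = 1 | X = x).
        coef, *_ = np.linalg.lstsq(design, labels, rcond=None)
        fitted = design @ coef
        # Average squared deviation of the fit from the label proportion.
        return np.mean((fitted - labels.mean()) ** 2)

    observed = statistic(y)

    # Permuting labels mimics the null hypothesis of equal distributions.
    perm_stats = np.array(
        [statistic(rng.permutation(y)) for _ in range(n_perm)]
    )
    p_value = (1 + np.sum(perm_stats >= observed)) / (n_perm + 1)
    return observed, p_value


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    sample_a = rng.normal(0.0, 1.0, size=(100, 3))
    sample_b = rng.normal(1.0, 1.0, size=(100, 3))  # mean-shifted sample
    stat, p = regression_two_sample_test(sample_a, sample_b)
    print(f"statistic={stat:.4f}, p-value={p:.4f}")
```

In this sketch the fitted values themselves give a rough local picture: points where the fit departs most from the overall label proportion are where the two samples appear to differ. Attaching statistical confidence to such local differences, as the abstract describes, would need a pointwise testing procedure that this example does not attempt.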


research
01/03/2023

Inspecting differences between multivariate distributions: graphical tool-kit and related tests

This article inspects whether a multivariate distribution is different f...
research
09/08/2015

On Wasserstein Two Sample Testing and Related Families of Nonparametric Tests

Nonparametric two sample or homogeneity testing is a decision theoretic ...
research
06/14/2021

Meta Two-Sample Testing: Learning Kernels for Testing with Limited Data

Modern kernel-based two-sample tests have shown great success in disting...
research
11/30/2018

Practical methods for graph two-sample testing

Hypothesis testing for graphs has been an important tool in applied rese...
research
08/31/2020

Goodness-of-fit tests for parametric regression models with circular response

Testing procedures for assessing a parametric regression model with circ...
research
12/05/2022

Testing for Regression Heteroskedasticity with High-Dimensional Random Forests

Statistical inference for high-dimensional regression heteroskedasticity...
research
05/27/2023

Auditing Fairness by Betting

We provide practical, efficient, and nonparametric methods for auditing ...
