Sequential Predictive Two-Sample and Independence Testing

04/29/2023
by Aleksandr Podkopaev, et al.

We study the problems of sequential nonparametric two-sample and independence testing. Sequential tests process data online and allow using observed data to decide whether to stop and reject the null hypothesis or to collect more data while maintaining type I error control. We build upon the principle of (nonparametric) testing by betting, where a gambler places bets on future observations and their wealth measures evidence against the null hypothesis. While recently developed kernel-based betting strategies often work well on simple distributions, selecting a suitable kernel for high-dimensional or structured data, such as text and images, is often nontrivial. To address this drawback, we design prediction-based betting strategies that rely on the following fact: if a sequentially updated predictor starts to consistently determine (a) which distribution an instance is drawn from, or (b) whether an instance is drawn from the joint distribution or the product of the marginal distributions (the latter produced by external randomization), it provides evidence against the two-sample or independence nulls respectively. We empirically demonstrate the superiority of our tests over kernel-based approaches under structured settings. Our tests can be applied beyond the case of independent and identically distributed data, remaining valid and powerful even when the data distribution drifts over time.
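The betting principle described above can be made concrete with a small sketch for the two-sample case. This is not the authors' implementation: the one-dimensional logistic predictor, the payoff formula, the fixed bet fraction `bet`, and all function and class names below are illustrative assumptions. At each round the predictor, trained only on past data, scores a fresh pair (x from P, y from Q); the payoff is half the difference of the two scores, which has conditional mean zero when P = Q, so the wealth is a nonnegative martingale under the null and rejecting once it reaches 1/alpha keeps the type I error at most alpha by Ville's inequality.

```python
import math
import random


class OnlineLogisticPredictor:
    """Hypothetical 1-D logistic model trained by SGD; label +1 means 'from P'."""

    def __init__(self, lr=0.1):
        self.w, self.b, self.lr = 0.0, 0.0, lr

    def score(self, z):
        # Prediction in [-1, 1]; positive means "looks like a draw from P".
        return math.tanh(self.w * z + self.b)

    def update(self, z, label):
        # One SGD step on the logistic loss log(1 + exp(-label * s)).
        margin = max(-30.0, min(30.0, label * (self.w * z + self.b)))
        step = self.lr * label / (1.0 + math.exp(margin))
        self.w += step * z
        self.b += step


def sequential_two_sample_test(pairs, alpha=0.05, bet=0.5):
    """Process (x, y) pairs online; return (rejected, rounds_used, wealth).

    Assumption: a fixed bet fraction in [0, 1) is used for simplicity;
    adaptive betting fractions are a common refinement.
    """
    predictor = OnlineLogisticPredictor()
    wealth, rounds = 1.0, 0
    for x, y in pairs:
        rounds += 1
        # Bet BEFORE learning from the new pair, so the bet is predictable.
        payoff = 0.5 * (predictor.score(x) - predictor.score(y))  # in [-1, 1]
        wealth *= 1.0 + bet * payoff  # stays positive because bet < 1
        if wealth >= 1.0 / alpha:
            return True, rounds, wealth  # stop and reject the null
        # Only now update the predictor with the revealed labels.
        predictor.update(x, +1)
        predictor.update(y, -1)
    return False, rounds, wealth


if __name__ == "__main__":
    rng = random.Random(0)
    # Alternative: P = N(0, 1), Q = N(1, 1).
    stream = ((rng.gauss(0.0, 1.0), rng.gauss(1.0, 1.0)) for _ in range(5000))
    print(sequential_two_sample_test(stream))
```

The key design point is the ordering inside the loop: the wealth is updated with a predictor that has not yet seen the current pair, which is what makes the bet fair under the null. Swapping the toy predictor for any sequentially updated classifier with scores in [-1, 1] leaves that argument unchanged.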

research
05/23/2023

A Rank-Based Sequential Test of Independence

We consider the problem of independence testing for two univariate rando...
research
12/16/2021

Game-theoretic Formulations of Sequential Nonparametric One- and Two-Sample Tests

We study the problem of designing consistent sequential one- and two-sam...
research
10/01/2022

Model-Free Sequential Testing for Conditional Independence via Testing by Betting

This paper develops a model-free sequential test for conditional indepen...
research
09/05/2007

Using Data Compressors to Construct Rank Tests

Nonparametric rank tests for homogeneity and component independence are ...
research
04/12/2022

Anytime-valid sequential testing for elicitable functionals via supermartingales

We design sequential tests for a large class of nonparametric null hypot...
research
12/14/2022

Sequential Kernelized Independence Testing

Independence testing is a fundamental and classical statistical problem ...
research
01/23/2019

kd-switch: A Universal Online Predictor with an application to Sequential Two-Sample Testing

We propose a novel online predictor for discrete labels conditioned on m...
