Nonasymptotic one- and two-sample tests in high dimension with unknown covariance structure

09/01/2021
by Gilles Blanchard, et al.

Let 𝐗 = (X_i)_{1 ≤ i ≤ n} be an i.i.d. sample of square-integrable variables in ℝ^d, with common expectation μ and covariance matrix Σ, both unknown. We consider the problem of testing whether μ is η-close to zero, i.e. ‖μ‖ ≤ η against ‖μ‖ ≥ η + δ; we also tackle the more general two-sample mean closeness (also known as relevant difference) testing problem. The aim of this paper is to obtain nonasymptotic upper and lower bounds on the minimal separation distance δ such that both the Type I and Type II errors can be controlled at a given level. The main technical tools are concentration inequalities, first for a suitable estimator of ‖μ‖^2 used as a test statistic, and second for estimators of the operator and Frobenius norms of Σ, which enter the quantiles of that test statistic. These properties are obtained for Gaussian and bounded distributions. Particular attention is given to the dependence on the pseudo-dimension d_* of the distribution, defined as d_* := ‖Σ‖_2^2/‖Σ‖_∞^2, where ‖Σ‖_2 is the Frobenius norm and ‖Σ‖_∞ the operator norm. In particular, for η = 0, the minimum separation distance is Θ(d_*^{1/4} √(‖Σ‖_∞/n)), in contrast with the minimax estimation distance for μ, which is Θ(d_e^{1/2} √(‖Σ‖_∞/n)) (where d_e := ‖Σ‖_1/‖Σ‖_∞, with ‖Σ‖_1 the trace norm). This generalizes a phenomenon spelled out in particular by Baraud (2002).
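To make the two dimension quantities concrete, here is a minimal Python sketch that computes d_* and d_e from a known covariance matrix, together with one natural unbiased estimator of ‖μ‖^2. The function names are hypothetical, and the U-statistic shown (the average of inner products ⟨X_i, X_j⟩ over pairs i ≠ j) is an assumption chosen for illustration; the abstract does not specify which estimator the paper uses.

```python
import numpy as np


def effective_dimensions(cov):
    """Return (d_star, d_e) for a symmetric positive semi-definite matrix.

    d_star = (Frobenius norm)^2 / (operator norm)^2
    d_e    = trace / operator norm
    """
    eig = np.linalg.eigvalsh(cov)      # eigenvalues of a symmetric matrix
    op_norm = eig.max()                # operator norm (largest eigenvalue)
    frob_sq = np.sum(eig ** 2)         # squared Frobenius norm
    trace = eig.sum()                  # trace norm for a PSD matrix
    return frob_sq / op_norm ** 2, trace / op_norm


def sq_norm_mean_ustat(x):
    """Unbiased U-statistic estimate of the squared norm of the mean.

    Illustrative assumption, not necessarily the paper's statistic:
    average of <X_i, X_j> over all ordered pairs with i != j.
    x has shape (n, d) with n >= 2.
    """
    n = x.shape[0]
    total = x.sum(axis=0)
    # The sum over i != j of <X_i, X_j> equals ||sum_i X_i||^2 - sum_i ||X_i||^2.
    return (total @ total - np.sum(x * x)) / (n * (n - 1))
```

For example, with Σ = diag(4, 1, 1) the function returns d_* = 18/16 = 1.125 and d_e = 6/4 = 1.5, so both quantities stay close to 1 when a single direction dominates the variance, while for the identity matrix in ℝ^d both equal d.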


