The minimax risk in testing the histogram of discrete distributions for uniformity under missing ball alternatives

05/29/2023
by   Alon Kipnis, et al.

We consider the problem of testing the fit of a discrete sample of items from many categories to the uniform distribution over the categories. As a class of alternative hypotheses, we consider the removal of an ℓ_p ball of radius ϵ around the uniform rate sequence for p ≤ 2. We deliver a sharp characterization of the asymptotic minimax risk when ϵ → 0 as the number of samples and the number of dimensions go to infinity, for testing based on the occurrences' histogram (number of absent categories, singletons, collisions, ...). For example, for p=1 and in the limit of a small expected number of samples n compared to the number of categories N (the so-called "sub-linear" regime), the minimax risk R^*_ϵ asymptotes to 2 Φ̅(n ϵ^2/√(8N)), with Φ̅(x) the normal survival function. Empirical studies over a range of problem parameters show that this estimate is accurate in finite samples, and that our test is significantly better than the chi-squared test or a test that only uses collisions. Our analysis is based on the asymptotic normality of histogram ordinates, the equivalence between the minimax setting and a Bayesian one, and the reduction of a multi-dimensional optimization problem to a one-dimensional problem.
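
To make the quantities in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It evaluates the quoted approximation 2 Φ̅(n ϵ^2/√(8N)) for the p=1, sub-linear case and builds the occurrences' histogram (absent categories, singletons, collisions, ...) from a sample. The function names (`normal_sf`, `approx_minimax_risk`, `occurrence_histogram`) are hypothetical, and the normalization of ϵ is assumed to follow the paper's convention, which the abstract does not spell out.

```python
import math
from collections import Counter


def normal_sf(x: float) -> float:
    """Standard normal survival function, 1 - Phi(x), via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def approx_minimax_risk(n: float, N: int, eps: float) -> float:
    """Approximation 2 * sf(n * eps^2 / sqrt(8 N)) quoted in the abstract for p = 1.

    Assumption: n is the expected number of samples, N the number of
    categories, and n is small relative to N (the sub-linear regime).
    """
    return 2.0 * normal_sf(n * eps ** 2 / math.sqrt(8.0 * N))


def occurrence_histogram(sample, N: int) -> dict:
    """Map m -> number of categories (out of N) that appear exactly m times.

    hist[0] counts absent categories, hist[1] singletons, hist[2] pairs, etc.
    """
    counts = Counter(sample)          # category -> number of occurrences
    hist = Counter(counts.values())   # occurrence count -> number of categories
    hist[0] = N - len(counts)         # categories never observed
    return dict(hist)


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    print(approx_minimax_risk(n=1000, N=10_000, eps=0.5))
    print(occurrence_histogram([0, 0, 1, 3], N=5))
```

At ϵ = 0 the approximation equals 1, the risk of a trivial test, and it decreases as n ϵ^2/√N grows, which matches the qualitative behavior described above.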

