Near-Optimal Bounds for Testing Histogram Distributions

07/14/2022
by Clément L. Canonne, et al.

We investigate the problem of testing whether a discrete probability distribution over an ordered domain is a histogram on a specified number of bins. A k-histogram over [n] is a probability distribution that is piecewise constant over a set of k intervals; such histograms are among the most common tools for the succinct approximation of data. The histogram testing problem is the following: given samples from an unknown distribution 𝐩 on [n], distinguish between the case that 𝐩 is a k-histogram and the case that 𝐩 is ε-far from every k-histogram in total variation distance. Our main result is a sample near-optimal and computationally efficient algorithm for this testing problem, together with a nearly matching (within logarithmic factors) sample complexity lower bound. Specifically, we show that the histogram testing problem has sample complexity Θ(√(nk)/ε + k/ε^2 + √n/ε^2), up to logarithmic factors.
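As a rough illustration of the definitions above, the following Python sketch counts the constant pieces of a known probability vector and evaluates the stated sample complexity expression. It is not the paper's testing algorithm, which works from samples of an unknown distribution. The function names, the tolerance `tol`, and the convention that a distribution with fewer than k pieces also counts as a k-histogram are assumptions made for this example; the bound is evaluated with constants and logarithmic factors dropped.

```python
from math import sqrt


def num_pieces(p, tol=1e-12):
    """Return the smallest number of intervals on which p is constant.

    p is a sequence of probabilities over the ordered domain [n].
    tol is a hypothetical tolerance for comparing neighbouring values.
    """
    if not p:
        return 0
    pieces = 1
    for prev, cur in zip(p, p[1:]):
        # A new interval starts wherever the value changes.
        if abs(cur - prev) > tol:
            pieces += 1
    return pieces


def is_k_histogram(p, k, tol=1e-12):
    """Check whether p is piecewise constant on at most k intervals.

    Assumption: fewer than k pieces still counts, since an interval
    can be split further without changing the distribution.
    """
    return num_pieces(p, tol) <= k


def sample_bound(n, k, eps):
    """Evaluate sqrt(n*k)/eps + k/eps^2 + sqrt(n)/eps^2.

    Constants and logarithmic factors are ignored, so the result only
    indicates how the bound scales with n, k, and eps.
    """
    return sqrt(n * k) / eps + k / eps**2 + sqrt(n) / eps**2


if __name__ == "__main__":
    p = [0.1, 0.1, 0.3, 0.3, 0.2]  # three constant pieces over [5]
    print(num_pieces(p))
    print(is_k_histogram(p, 3), is_k_histogram(p, 2))
    print(sample_bound(n=100, k=4, eps=0.5))
```

Evaluating `sample_bound` for a few values of n, k, and ε shows which of the three terms dominates in a given regime; for small ε the two 1/ε^2 terms take over.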

research
04/10/2018

Testing Identity of Multidimensional Histograms

We investigate the problem of identity testing for multidimensional hist...
research
04/13/2023

Near-Optimal Degree Testing for Bayes Nets

This paper considers the problem of testing the maximum in-degree of the...
research
11/29/2019

Location histogram privacy by sensitive location hiding and target histogram avoidance/resemblance (extended version)

A location histogram is comprised of the number of times a user has visi...
research
08/09/2020

Testing Determinantal Point Processes

Determinantal point processes (DPPs) are popular probabilistic models of...
research
02/23/2018

Fast and Sample Near-Optimal Algorithms for Learning Multidimensional Histograms

We study the problem of robustly learning multi-dimensional histograms. ...
research
05/04/2023

Testing Convex Truncation

We study the basic statistical problem of testing whether normally distr...
research
01/31/2019

Minimax Testing of Identity to a Reference Ergodic Markov Chain

We exhibit an efficient procedure for testing, based on a single long st...
