Beta-trees: Multivariate histograms with confidence statements

08/02/2023
by Guenther Walther, et al.
Multivariate histograms are difficult to construct because of the curse of dimensionality. Motivated by k-d trees in computer science, we show how to construct an efficient data-adaptive partition of Euclidean space that possesses the following two properties. First, with high confidence, the distribution from which the data are generated is close to uniform on each rectangle of the partition. Second, despite the data-dependent construction, we can give guaranteed finite-sample simultaneous confidence intervals for the probabilities (and hence for the average densities) of each rectangle in the partition. This partition automatically adapts to the sizes of the regions where the distribution is close to uniform. The methodology produces confidence intervals whose widths depend only on the probability content of the rectangles and not on the dimensionality of the space, thus avoiding the curse of dimensionality. Moreover, the widths essentially match the optimal widths in the univariate setting. The simultaneous validity of the confidence intervals allows this construction, which we call Beta-trees, to be used for various data-analytic purposes. We illustrate this by using Beta-trees for visualizing data and for multivariate mode-hunting.
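To make the idea of a k-d-tree-style histogram concrete, the following is a minimal sketch, not the Beta-tree construction itself. It assumes median splits that cycle through the coordinates and a fixed depth as the stopping rule; the abstract specifies neither, and the function names (`kd_partition`, `box_volume`, `rectangle_histogram`) are hypothetical. It reports only the empirical probability of each rectangle and the corresponding average density (probability divided by volume). The simultaneous confidence intervals are deliberately omitted because the abstract does not state how they are computed.

```python
import random


def kd_partition(points, lower, upper, max_depth, depth=0):
    """Split the box [lower, upper] at the sample median, cycling through
    the coordinates, and return leaves as (lower, upper, points) tuples.
    The split rule and the depth-based stopping rule are assumptions."""
    if depth == max_depth or len(points) < 2:
        return [(lower, upper, points)]
    axis = depth % len(lower)
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    cut = ordered[mid][axis]  # split at an observed data point
    left_upper = list(upper)
    left_upper[axis] = cut
    right_lower = list(lower)
    right_lower[axis] = cut
    return (
        kd_partition(ordered[:mid], lower, tuple(left_upper), max_depth, depth + 1)
        + kd_partition(ordered[mid:], tuple(right_lower), upper, max_depth, depth + 1)
    )


def box_volume(lower, upper):
    """Volume of the axis-aligned rectangle [lower, upper]."""
    vol = 1.0
    for lo, hi in zip(lower, upper):
        vol *= hi - lo
    return vol


def rectangle_histogram(points, lower, upper, max_depth):
    """Return (lower, upper, empirical probability, average density) per leaf."""
    n = len(points)
    rows = []
    for lo, hi, pts in kd_partition(points, lower, upper, max_depth):
        prob = len(pts) / n
        vol = box_volume(lo, hi)
        # Ties in the data can produce a degenerate (zero-volume) rectangle.
        density = prob / vol if vol > 0 else float("inf")
        rows.append((lo, hi, prob, density))
    return rows


# Usage: 256 uniform points in the unit square, split to depth 4.
rng = random.Random(0)
data = [(rng.random(), rng.random()) for _ in range(256)]
for lo, hi, prob, density in rectangle_histogram(data, (0.0, 0.0), (1.0, 1.0), 4):
    print(lo, hi, prob, density)
```

Because each split is placed at the sample median, every leaf holds roughly the same number of points, so rectangles are small where the data are dense and large where they are sparse. This is the sense in which such a partition adapts to the data.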
