Approximate Calculation of Tukey's Depth and Median With High-dimensional Data

12/07/2018
by Milica Bogićević, et al.

We present a new fast approximate algorithm for computing Tukey (halfspace) depth level sets, together with its implementation, ABCDepth. Given a d-dimensional data set for any d ≥ 1, the algorithm is based on a representation of level sets as intersections of balls in R^d. Our approach does not require projecting sample points onto directions. This novel idea enables the calculation of approximate level sets in very high dimensions with complexity that is linear in d, which is a great advantage over all other approximate algorithms. Using different versions of this algorithm, we demonstrate approximate calculation of the deepest set of points (the "Tukey median") and of the Tukey depth of a sample point or an out-of-sample point, all with complexity linear in d. An additional theoretical advantage of this approach is that the data points are not assumed to be in "general position". Examples with real and synthetic data show that, in high dimensions, the execution time of the algorithm in all of the versions mentioned is much smaller than that of other implemented algorithms. Our algorithms can also be used with thousands of multidimensional observations.
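For orientation, the quantity being approximated can be illustrated with the projection-based baseline that the abstract contrasts itself with. The sketch below is not ABCDepth and is not taken from the paper: it uses the standard definition of Tukey depth (the smallest fraction of sample points in any closed halfspace whose boundary passes through the query point) and approximates the minimum over all directions by a minimum over randomly drawn ones. The function name `approx_tukey_depth` and the parameters `n_dirs` and `seed` are hypothetical names chosen for this illustration.

```python
import numpy as np


def approx_tukey_depth(x, data, n_dirs=1000, seed=0):
    """Random-direction approximation of the Tukey depth of x.

    Illustrative baseline only (NOT the ABCDepth algorithm).
    Because only finitely many directions are tried, the result
    is an upper bound on the exact depth.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)   # shape (n, d)
    x = np.asarray(x, dtype=float)         # shape (d,)
    n, d = data.shape

    # Draw directions uniformly on the unit sphere in R^d.
    dirs = rng.standard_normal((n_dirs, d))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)

    # Signed projections of the centred sample onto each direction.
    proj = (data - x) @ dirs.T             # shape (n, n_dirs)

    # For each direction u, count points in the closed halfspace
    # {y : u . (y - x) >= 0}; the depth is the smallest such fraction.
    counts = (proj >= 0).sum(axis=0)
    return counts.min() / n


# Usage: in one dimension the only directions are +1 and -1,
# so the approximation coincides with the exact depth.
sample = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
depth_of_median = approx_tukey_depth([3.0], sample)
depth_of_extreme = approx_tukey_depth([1.0], sample)
```

Each direction costs one pass over the n × d data, and the number of directions needed for a good approximation typically grows with d. That growth is the cost the abstract says the ball-intersection representation is meant to avoid.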


research
09/23/2022

Another look at halfspace depth: Flag halfspaces with applications

The halfspace depth is a well studied tool of nonparametric statistics i...
research
07/10/2018

A GPU-Oriented Algorithm Design for Secant-Based Dimensionality Reduction

Dimensionality-reduction techniques are a fundamental tool for extractin...
research
06/25/2021

Accelerated Computation of a High Dimensional Kolmogorov-Smirnov Distance

Statistical testing is widespread and critical for a variety of scientif...
research
05/19/2011

Hierarchical Recursive Running Median

To date, the histogram-based running median filter of Perreault and Hébe...
research
10/24/2018

Extending the centerpoint theorem to multiple points

The centerpoint theorem is a well-known and widely used result in discre...
research
11/05/2017

Practical Data-Dependent Metric Compression with Provable Guarantees

We introduce a new distance-preserving compact representation of multi-d...
research
11/08/2022

Adaptive Data Depth via Multi-Armed Bandits

Data depth, introduced by Tukey (1975), is an important tool in data sci...
