The Size of a t-Digest

03/24/2019
by Ted Dunning, et al.

A t-digest is a compact data structure that allows estimates of quantiles with increased accuracy near q = 0 or q = 1. This is done by clustering samples from ℝ subject to the constraint that the number of points associated with any particular centroid is limited so that the so-called k-size of the centroid is always < 1. The k-size is defined using a scale function that maps quantile q to index k. This paper provides bounds on the sizes of t-digests created using any of four known scale functions.
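
To make the k-size constraint concrete, here is a minimal Python sketch. It assumes the arcsine scale function k(q) = δ/(2π) · asin(2q − 1), which is one scale function commonly associated with t-digests; the abstract does not name the four it covers. It also assumes that the k-size of a centroid is the difference in k between the right and left quantile boundaries of that centroid. The function names and the compression parameter `delta` are illustrative and not taken from the paper.

```python
import math


def k1(q: float, delta: float) -> float:
    """Assumed arcsine scale function: maps quantile q in [0, 1] to index k."""
    return delta / (2 * math.pi) * math.asin(2 * q - 1)


def k_size(q_left: float, q_right: float, delta: float) -> float:
    """k-size of a centroid covering the quantile range [q_left, q_right]."""
    return k1(q_right, delta) - k1(q_left, delta)


def satisfies_constraint(q_left: float, q_right: float, delta: float) -> bool:
    """True if the centroid's k-size stays below 1, as the abstract requires."""
    return k_size(q_left, q_right, delta) < 1


if __name__ == "__main__":
    delta = 100.0
    # The same quantile width (0.01) costs far more k near the tail than
    # near the median, so tail centroids must hold fewer points.
    print("tail   k-size:", k_size(0.0, 0.01, delta))
    print("median k-size:", k_size(0.495, 0.505, delta))
    print("median centroid allowed:", satisfies_constraint(0.495, 0.505, delta))
    print("tail centroid allowed:  ", satisfies_constraint(0.0, 0.01, delta))
```

Under this assumed scale function the whole range q = 0 to q = 1 spans δ/2 units of k, which gives an intuition for why the number of centroids is tied to the compression parameter. The bounds themselves are the subject of the paper and are not reproduced here.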
