The Size of a t-Digest
A t-digest is a compact data structure that allows estimates of quantiles with increased accuracy near q = 0 or q = 1. This is done by clustering samples from ℝ subject to the constraint that the number of points associated with any particular centroid is limited so that the so-called k-size of the centroid is always < 1. The k-size is defined using a scale function that maps quantile q to index k. This paper provides bounds on the sizes of t-digests created using any of four known scale functions.
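To make the k-size constraint concrete, the following is a minimal sketch, not the paper's own code. The abstract does not name the four scale functions, so the sketch assumes one commonly cited in the t-digest literature, k_1(q) = (δ / 2π) · asin(2q − 1), where the compression parameter δ (`delta`) and the helper names `k1`, `k_size`, and `right_edge_at_unit_k_size` are illustrative assumptions. The k-size of a centroid is taken to be the difference in k between the right and left quantile edges of the samples it covers.

```python
import math

def k1(q, delta=100.0):
    """Assumed scale function k_1: maps a quantile q in [0, 1] to an index k."""
    return delta / (2.0 * math.pi) * math.asin(2.0 * q - 1.0)

def k_size(q_left, q_right, delta=100.0):
    """k-size of a centroid covering the quantile range [q_left, q_right]."""
    return k1(q_right, delta) - k1(q_left, delta)

def right_edge_at_unit_k_size(q_left, delta=100.0):
    """Quantile at which a centroid starting at q_left reaches k-size 1.

    Solves k1(q_right) - k1(q_left) = 1 for q_right, capped at q = 1.
    """
    angle = math.asin(2.0 * q_left - 1.0) + 2.0 * math.pi / delta
    if angle >= math.pi / 2.0:
        return 1.0
    return (math.sin(angle) + 1.0) / 2.0

if __name__ == "__main__":
    # Allowed centroid width (as a fraction of all samples) at several quantiles.
    for q in (0.0, 0.01, 0.1, 0.5):
        width = right_edge_at_unit_k_size(q) - q
        print(f"q_left={q:<5} max width={width:.6f}")
```

Under this assumed scale function, a centroid that starts at q = 0 may cover far fewer samples than one that starts at q = 0.5, which is the mechanism behind the increased accuracy near the tails. The full range [0, 1] spans δ/2 units of k, which gives a sense of why the number of centroids is tied to δ.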