The Size of a t-Digest

03/24/2019
by Ted Dunning, et al.

A t-digest is a compact data structure that allows estimates of quantiles with increased accuracy near q = 0 or q = 1. This is done by clustering samples from ℝ subject to the constraint that the number of points associated with any particular centroid is limited so that the so-called k-size of the centroid is always < 1. The k-size is defined using a scale function that maps quantile q to index k. This paper provides bounds on the sizes of t-digests created using any of four known scale functions.
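To make the k-size constraint concrete, here is a minimal sketch in Python. The abstract does not name the four scale functions, so the arcsine form used below, k(q) = (δ / 2π) · asin(2q − 1), is an assumption chosen for illustration, as are the compression parameter `delta` and the helper names `k1`, `k_size`, and `fits`. The sketch treats the k-size of a cluster spanning quantiles `q_left` to `q_right` as the difference of the scale function at its two ends.

```python
import math


def k1(q, delta):
    """Assumed arcsine scale function mapping quantile q in [0, 1] to index k.

    This particular form is an illustrative assumption; the abstract only
    says that a scale function maps q to k.
    """
    return delta / (2 * math.pi) * math.asin(2 * q - 1)


def k_size(q_left, q_right, delta):
    """k-size of a cluster covering the quantile range [q_left, q_right]."""
    return k1(q_right, delta) - k1(q_left, delta)


def fits(q_left, q_right, delta):
    """True if the cluster satisfies the k-size < 1 constraint from the abstract."""
    return k_size(q_left, q_right, delta) < 1


if __name__ == "__main__":
    delta = 100  # hypothetical compression parameter
    # Two clusters that each hold 1% of the samples:
    print("tail cluster fits:  ", fits(0.0, 0.01, delta))
    print("middle cluster fits:", fits(0.495, 0.505, delta))
```

Under this assumed scale function, a cluster holding 1% of the data violates the constraint at the tail but satisfies it near the median, which is how the digest ends up with small clusters, and hence higher accuracy, near q = 0 and q = 1.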


