Distribution-Sensitive Bounds on Relative Approximations of Geometric Ranges
A family R of ranges and a set X of points together define a range space (X, R|_X), where R|_X = {X ∩ h | h ∈R}. We want to find a structure to estimate the quantity |X ∩ h|/|X| for any range h ∈R with the (ρ, ϵ)-guarantee: (i) if |X ∩ h|/|X| > ρ, the estimate must have a relative error ϵ; (ii) otherwise, the estimate must have an absolute error ρϵ. The objective is to minimize the size of the structure. Currently, the dominant solution is to compute a relative (ρ, ϵ)-approximation, which is a subset of X with Õ(λ/(ρϵ^2)) points, where λ is the VC-dimension of (X, R|_X), and Õ hides polylog factors. This paper shows a more general bound sensitive to the content of X. We give a structure that stores O( (1/ρ)) integers plus Õ(θ· (λ/ϵ^2)) points of X, where θ - called the disagreement coefficient - measures how much the ranges differ from each other in their intersections with X. The value of θ is between 1 and 1/ρ, such that our space bound is never worse than that of relative (ρ, ϵ)-approximations, but we improve the latter's 1/ρ term whenever θ = o(1/ρ (1/ρ)). We also prove that, in the worst case, summaries with the (ρ, 1/2)-guarantee must consume Ω(θ) words even for d = 2 and λ< 3. We then constrain R to be the set of halfspaces in R^d for a constant d, and prove the existence of structures with o(1/(ρϵ^2)) size offering (ρ,ϵ)-guarantees, when X is generated from various stochastic distributions. This is the first formal justification on why the term 1/ρ is not compulsory for "realistic" inputs.
READ FULL TEXT