Computing Approximate Statistical Discrepancy

04/30/2018
by Michael Matheny, et al.

Consider a geometric range space (X, 𝒜) where each data point x ∈ X has two or more values (say r(x) and b(x)). Also consider a function Φ(A), defined for any range A ∈ (X, 𝒜) in terms of the sums of values in that range, e.g., r_A = ∑_{x ∈ A} r(x) and b_A = ∑_{x ∈ A} b(x). The Φ-maximum range is A^* = argmax_{A ∈ (X, 𝒜)} Φ(A). Our goal is to find some Â such that |Φ(Â) - Φ(A^*)| ≤ ε. We develop algorithms for this problem for range spaces with bounded VC-dimension, as well as significant improvements for those defined by balls, halfspaces, and axis-aligned rectangles. This problem has applications in many areas, including discrepancy evaluation, classification, and spatial scan statistics.
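
To make the problem statement concrete, here is a minimal brute-force sketch for one of the range families named above, axis-aligned rectangles in the plane. It is an illustration only and not the algorithm developed in the paper: it enumerates every rectangle whose sides pass through data coordinates and evaluates Φ exactly. The choice Φ(A) = |r_A - b_A| and the names `phi_abs_diff` and `max_rectangle_bruteforce` are assumptions made for this example.

```python
from itertools import combinations_with_replacement


def phi_abs_diff(r_a, b_a):
    """Assumed example of Phi: absolute difference of the two range sums."""
    return abs(r_a - b_a)


def max_rectangle_bruteforce(points, r, b, phi=phi_abs_diff):
    """Exhaustively find the axis-aligned rectangle A maximizing phi(r_A, b_A).

    points: list of (x, y) tuples; r, b: per-point values r(x) and b(x).
    Only rectangles with sides on data coordinates are tried, which is
    enough because any other rectangle contains the same points as one
    of these. The running time is polynomial but high; this is a
    baseline for small inputs, not an efficient method.
    """
    xs = sorted({p[0] for p in points})
    ys = sorted({p[1] for p in points})
    best_val, best_rect = phi(0.0, 0.0), None  # start from the empty range
    for x_lo, x_hi in combinations_with_replacement(xs, 2):
        for y_lo, y_hi in combinations_with_replacement(ys, 2):
            r_a = b_a = 0.0
            for (px, py), rv, bv in zip(points, r, b):
                if x_lo <= px <= x_hi and y_lo <= py <= y_hi:
                    r_a += rv  # r_A: sum of r(x) over points in the range
                    b_a += bv  # b_A: sum of b(x) over points in the range
            val = phi(r_a, b_a)
            if val > best_val:
                best_val, best_rect = val, (x_lo, x_hi, y_lo, y_hi)
    return best_rect, best_val


# Hypothetical toy data: two "red" points and two "blue" points,
# each class normalized to total weight 1.
points = [(0, 0), (1, 1), (2, 2), (5, 5)]
r = [0.5, 0.5, 0.0, 0.0]
b = [0.0, 0.0, 0.5, 0.5]
rect, value = max_rectangle_bruteforce(points, r, b)
```

An approximate method in the sense of the abstract would return some range Â whose Φ value is within ε of the value found by this exhaustive search, ideally at far lower cost.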
