Scalable Hash-Based Estimation of Divergence Measures

01/01/2018
by Morteza Noshad, et al.

We propose a scalable divergence estimation method based on hashing. Consider two continuous random variables X and Y whose densities have bounded support. We apply a particular locality-sensitive random hash function to the samples and, in each hash bin that contains a non-zero number of Y samples, compute the ratio of the number of X samples to the number of Y samples. We prove that the weighted average of these ratios over all hash bins converges to the f-divergence between the two sample sets. We show that the proposed estimator is optimal in terms of both MSE rate and computational complexity. We derive the MSE rates for two families of smooth functions: the Hölder smoothness class and differentiable functions. In particular, we prove that if the density functions have bounded derivatives up to order d/2, where d is the dimension of the samples, the optimal parametric MSE rate of O(1/N) can be achieved. The computational complexity is shown to be O(N), which is optimal. To the best of our knowledge, this is the first empirical divergence estimator that has optimal computational complexity and achieves the optimal parametric MSE rate.
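The bin-counting idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `hash_divergence` and `kl_g`, the bin width `eps`, the randomly shifted grid used as the locality-sensitive hash, the choice g(t) = t log t (which gives a KL-type divergence), and the clipping of the result at zero are all assumptions made for illustration. A Python dictionary stands in for the hash table, so each sample is processed once and the expected cost is linear in the number of samples.

```python
import math
import random
from collections import Counter


def hash_divergence(xs, ys, g, eps, seed=0):
    """Illustrative hash-based f-divergence estimate (a sketch, not the paper's code).

    xs, ys: lists of equal-length tuples (samples of X and Y).
    g:      convex function defining the f-divergence (assumed, e.g. t*log(t)).
    eps:    bin width of the grid hash (hypothetical tuning parameter).
    """
    rng = random.Random(seed)
    d = len(xs[0])
    # Random shift of the grid; nearby points tend to share a bin.
    shift = [rng.uniform(0, eps) for _ in range(d)]

    def cell(p):
        # Quantize each coordinate; the resulting tuple is the hash key.
        return tuple(math.floor((v + s) / eps) for v, s in zip(p, shift))

    n_counts = Counter(cell(p) for p in xs)  # X samples per bin
    m_counts = Counter(cell(p) for p in ys)  # Y samples per bin
    n, m = len(xs), len(ys)

    total = 0.0
    # Only bins with at least one Y sample contribute.
    for key, m_i in m_counts.items():
        ratio = (n_counts.get(key, 0) / n) / (m_i / m)  # density-ratio proxy
        total += (m_i / m) * g(ratio)                   # weight by Y mass in bin
    # f-divergences are non-negative, so clip (an assumption of this sketch).
    return max(total, 0.0)


def kl_g(t):
    # g(t) = t log t, with the convention g(0) = 0.
    return t * math.log(t) if t > 0 else 0.0
```

For example, `hash_divergence(xs, ys, kl_g, eps=0.1)` would estimate a KL-type divergence between two lists of d-dimensional tuples. In practice the bin width would have to shrink with the sample size for the estimate to converge; the sketch leaves that choice to the caller.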


