Estimating Divergences in High Dimensions

12/08/2021
by Loong Kuan Lee, et al.

Estimating the divergence between two high-dimensional distributions from limited samples is an important problem in fields such as machine learning. Although previous methods perform well on moderate-dimensional data, their accuracy degrades in settings with hundreds of binary variables. We therefore propose using decomposable models to estimate divergences between high-dimensional distributions. These models allow us to factorize the estimated density of a high-dimensional distribution into a product of lower-dimensional functions. We conduct formal and experimental analyses of the properties of decomposable models in the context of divergence estimation. In particular, we show empirically that estimating the Kullback-Leibler divergence using decomposable models fitted by maximum likelihood outperforms existing divergence estimators when the dimensionality is high and useful decomposable models can be learnt from the available data.
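As a rough illustration of the idea, the sketch below estimates KL(P||Q) over binary variables by fitting a simple decomposable model to each sample set and averaging the log-density ratio over samples from P. The chain factorization p(x) = p(x₁)·∏ p(xᵢ | xᵢ₋₁), the Laplace smoothing, and all function names here are illustrative assumptions, not the paper's actual method or structure-learning procedure.

```python
import numpy as np

def fit_chain(samples, alpha=1.0):
    """Fit a chain-factorized density p(x) = p(x0) * prod_i p(x_i | x_{i-1})
    over binary variables, with Laplace smoothing alpha.
    (Assumed fixed chain structure; the paper learns decomposable models.)"""
    n, d = samples.shape
    # Smoothed marginal P(x0 = 1).
    p0 = (samples[:, 0].sum() + alpha) / (n + 2 * alpha)
    # Smoothed conditionals P(x_i = 1 | x_{i-1} = v) for v in {0, 1}.
    cond = np.zeros((d - 1, 2))
    for i in range(1, d):
        for v in (0, 1):
            mask = samples[:, i - 1] == v
            cond[i - 1, v] = (samples[mask, i].sum() + alpha) / (mask.sum() + 2 * alpha)
    return p0, cond

def log_density(x, p0, cond):
    """Log-density of one binary vector under the fitted chain model."""
    lp = np.log(p0 if x[0] == 1 else 1 - p0)
    for i in range(1, len(x)):
        q = cond[i - 1, x[i - 1]]
        lp += np.log(q if x[i] == 1 else 1 - q)
    return lp

def kl_estimate(samples_p, samples_q):
    """Plug-in KL(P||Q) estimate: average log-ratio of the two fitted
    chain densities, taken over the samples drawn from P."""
    model_p = fit_chain(samples_p)
    model_q = fit_chain(samples_q)
    return np.mean([log_density(x, *model_p) - log_density(x, *model_q)
                    for x in samples_p])
```

Because each factor involves only a pair of variables, the counts needed to fit it remain reliable even when the joint dimension is large, which is the key benefit the abstract points to.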

