Direct estimation of density functionals using a polynomial basis

02/21/2017
by Alan Wisler, et al.

A number of fundamental quantities in statistical signal processing and information theory can be expressed as integral functions of two probability density functions. Such quantities are called density functionals because they map density functions to the real line. For example, information divergence functions measure the dissimilarity between two probability density functions and are useful in a number of applications. Typically, estimating these quantities requires complete knowledge of the underlying distributions followed by multi-dimensional integration. Existing methods either make parametric assumptions about the data distribution or use non-parametric density estimation followed by high-dimensional integration. In this paper, we propose a new alternative. We introduce the concept of "data-driven basis functions": functions of distributions whose values we can estimate given only samples from the underlying distributions, without requiring distribution fitting or direct integration. We derive a new data-driven complete basis that is similar to the deterministic Bernstein polynomial basis and develop two methods for performing basis expansions of functionals of two distributions. We also show that the new basis set allows us to approximate functions of distributions as closely as desired. Finally, we evaluate the methodology by developing data-driven estimators for the Kullback-Leibler divergences and the Hellinger distance and by constructing empirical estimates of tight bounds on the Bayes error rate.
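For readers unfamiliar with the deterministic Bernstein polynomial basis that the abstract compares against, the following is a minimal sketch of the classical construction only. It is not the data-driven basis or the estimators proposed in the paper. The target function `sqrt(x * (1 - x))`, the evaluation grid, and all function names are illustrative assumptions.

```python
from math import comb, sqrt


def bernstein_basis(k, n, x):
    """k-th Bernstein basis polynomial of degree n, evaluated at x in [0, 1]."""
    return comb(n, k) * x**k * (1 - x) ** (n - k)


def bernstein_approx(f, n, x):
    """Degree-n Bernstein approximation of f at x: sum_k f(k/n) * b_{k,n}(x)."""
    return sum(f(k / n) * bernstein_basis(k, n, x) for k in range(n + 1))


def target(x):
    # Illustrative continuous function on [0, 1]; an assumption, not from the paper.
    return sqrt(x * (1 - x))


# Evaluation grid on [0, 1] (assumed resolution).
grid = [i / 100 for i in range(101)]


def max_error(n):
    """Largest absolute approximation error over the grid for degree n."""
    return max(abs(bernstein_approx(target, n, x) - target(x)) for x in grid)


if __name__ == "__main__":
    for degree in (5, 20, 50):
        print(degree, max_error(degree))
```

Raising the degree shrinks the worst-case error on the grid, which is the deterministic counterpart of the "as closely as desired" approximation property the abstract claims for the new basis.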


research
01/29/2020

Data-driven aggregation in non-parametric density estimation on the real line

We study non-parametric estimation of an unknown density with support in...
research
09/23/2016

Estimating Probability Distributions using "Dirac" Kernels (via Rademacher-Walsh Polynomial Basis Functions)

In many applications (in particular information systems, such as pattern...
research
10/07/2019

Where to find needles in a haystack?

In many existing methods in multiple comparison, one starts with either ...
research
08/06/2014

Empirical non-parametric estimation of the Fisher Information

The Fisher information matrix (FIM) is a foundational concept in statist...
research
10/07/2022

Estimation of the Order of Non-Parametric Hidden Markov Models using the Singular Values of an Integral Operator

We are interested in assessing the order of a finite-state Hidden Markov...
research
04/03/2019

Creating new distributions using integration and summation by parts

Methods for generating new distributions from old can be thought of as t...
research
06/03/2019

Temporal Density Extrapolation using a Dynamic Basis Approach

Density estimation is a versatile technique underlying many data mining ...
