A witness function based construction of discriminative models using Hermite polynomials

01/10/2019
by   H. N. Mhaskar, et al.

In machine learning, we are given a dataset {(x_j, y_j)}_{j=1}^M drawn as i.i.d. samples from an unknown probability distribution μ, with μ^* denoting the marginal distribution of the x_j's. We propose that, rather than using a positive kernel such as the Gaussian to estimate these measures, one should use a non-positive kernel that preserves a large number of their moments; this yields an optimal approximation. We use multivariate Hermite polynomials for this purpose, and prove optimal and local approximation results, in a supremum norm, in a probabilistic sense. Together with a permutation test developed with the same kernel, we prove that the kernel estimator serves as a `witness function' in classification problems: if the value of this estimator at a point x exceeds a certain threshold, then the point reliably belongs to a certain class. This approach can be used to modify pretrained algorithms, such as neural networks or nonlinear dimension reduction techniques, to identify in-class versus out-of-class regions for the purposes of generative models, classification uncertainty, or finding robust centroids. We demonstrate this on a number of real-world data sets, including MNIST, CIFAR10, Science News documents, and the LaLonde data set.
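The witness-function idea in the abstract can be sketched in one dimension: build a kernel from orthonormal Hermite functions, form the estimator F(x) = (1/M) Σ_j y_j Φ_n(x, x_j) with labels y_j = ±1, and read off the class of a point from the sign (or a threshold) of F. The sketch below is a minimal illustration under simplifying assumptions: it uses the plain degree-n projection kernel Φ_n(x, y) = Σ_{k<n} ψ_k(x) ψ_k(y), whereas the paper works with a smoothed, multivariate localized kernel; the function names and the truncation degree `n` are illustrative choices, not the authors' implementation.

```python
import numpy as np
from math import factorial, pi, sqrt
from scipy.special import eval_hermite  # physicists' Hermite polynomial H_k


def hermite_function(k, x):
    """Orthonormal Hermite function psi_k(x) = H_k(x) e^{-x^2/2} / sqrt(2^k k! sqrt(pi))."""
    norm = 1.0 / sqrt(2.0**k * factorial(k) * sqrt(pi))
    return norm * np.exp(-(np.asarray(x) ** 2) / 2.0) * eval_hermite(k, x)


def hermite_kernel(x, y, n):
    """Degree-(n-1) projection kernel Phi_n(x, y) = sum_{k<n} psi_k(x) psi_k(y).

    Note: this kernel is NOT positive -- it oscillates off the diagonal,
    which is exactly the moment-preserving behavior the abstract advocates.
    """
    return sum(hermite_function(k, x) * hermite_function(k, y) for k in range(n))


def witness(x, data_x, data_y, n):
    """Witness function F(x) = (1/M) sum_j y_j Phi_n(x, x_j), labels y_j = +-1."""
    return np.mean(data_y * hermite_kernel(x, data_x, n))
```

As a usage example, draw two well-separated classes (say, Gaussians centered at ±2 with labels ±1); the witness function is then positive near the +1 class and negative near the −1 class, so thresholding its value at a point decides class membership.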


