Optimal Sampling Density for Nonparametric Regression

05/25/2021
by Danny Panknin, et al.

We propose a novel active learning strategy for regression that is model-agnostic, robust against model mismatch, and interpretable. Assuming that a small number of initial samples are available, we derive the optimal training density that minimizes the generalization error of local polynomial smoothing (LPS) with its kernel bandwidth tuned locally. We adopt the mean integrated squared error (MISE) as the generalization criterion, and use the asymptotic behavior of the MISE together with the locally optimal bandwidths (LOB), that is, the bandwidth function that minimizes the MISE in the asymptotic limit. The asymptotic expression of our objective then reveals how the MISE depends on the training density, which enables analytic minimization. As a result, we obtain the optimal training density in closed form. Because the approach is almost model-free, the resulting density should encode raw properties of the target problem, and thus provide a robust and model-agnostic active learning strategy. Furthermore, the obtained training density factorizes the influence of local function complexity, noise level, and test density in a transparent and interpretable way. We validate our theory in numerical simulations, and show that the proposed active learning method outperforms the existing state-of-the-art model-agnostic approaches.
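
To make the smoother named in the abstract concrete, here is a minimal sketch of local polynomial smoothing of order one (local linear regression) in which the bandwidth may differ at every query point. It is an illustration only, not the authors' implementation: the function name `local_linear_smooth`, the Gaussian kernel, and the one-dimensional setting are assumptions, and the bandwidths are supplied by the caller rather than estimated as locally optimal bandwidths.

```python
import numpy as np

def local_linear_smooth(x_train, y_train, x_query, bandwidth):
    """Local linear fit at each query point (hypothetical helper).

    `bandwidth` is a scalar or an array with one value per query point,
    so the amount of smoothing can vary locally. A Gaussian kernel is
    assumed here purely for illustration.
    """
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    x_query = np.atleast_1d(np.asarray(x_query, dtype=float))
    h = np.broadcast_to(np.asarray(bandwidth, dtype=float), x_query.shape)

    preds = np.empty_like(x_query)
    for i, (x0, h0) in enumerate(zip(x_query, h)):
        d = x_train - x0                           # offsets from the query point
        w = np.exp(-0.5 * (d / h0) ** 2)           # Gaussian kernel weights
        X = np.column_stack([np.ones_like(d), d])  # design matrix [1, x - x0]
        sw = np.sqrt(w)
        # Weighted least squares, solved as ordinary least squares on
        # rows scaled by the square root of the weights.
        beta, *_ = np.linalg.lstsq(X * sw[:, None], y_train * sw, rcond=None)
        preds[i] = beta[0]                         # intercept = fitted value at x0
    return preds
```

For example, `local_linear_smooth(x, y, x_grid, h_grid)` with a smaller `h_grid` where the target varies quickly and a larger one where it is smooth mimics, by hand, the kind of local bandwidth adaptation the abstract refers to.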

research
11/09/2020

Bayesian bandwidth estimation for local linear fitting in nonparametric regression models

This paper presents a Bayesian sampling approach to bandwidth estimation...
research
06/13/2018

Trapezoidal rule and sampling designs for the nonparametric estimation of the regression function in models with correlated errors

The problem of estimating the regression function in a fixed design mode...
research
05/04/2018

Axiomatic Approach to Variable Kernel Density Estimation

Variable kernel density estimation allows the approximation of a probabi...
research
10/24/2022

Active Learning for Single Neuron Models with Lipschitz Non-Linearities

We consider the problem of active learning for single neuron models, als...
research
07/28/2021

Robust and Active Learning for Deep Neural Network Regression

We describe a gradient-based method to discover local error maximizers o...
research
01/25/2016

A Robust UCB Scheme for Active Learning in Regression from Strategic Crowds

We study the problem of training an accurate linear regression model by ...
research
09/28/2018

Target-Independent Active Learning via Distribution-Splitting

To reduce the label complexity in Agnostic Active Learning (A^2 algorith...
