A Multiple Regression-Enhanced Convolution Estimator for the Density of a Response Variable in the Presence of Additional Covariate Information
In this paper we propose a convolution estimator for the density of a response variable that uses an underlying multiple regression framework to improve the accuracy of density estimates by incorporating auxiliary information. Suppose we have a sample of N complete-case observations of a response variable and an associated set of covariates, along with an additional sample of M observations of the covariates only. We show that the mean square error of the multiple regression-enhanced convolution estimator converges to zero as O(N^(-1)) and, moreover, that for a large fixed N the mean square error converges as O(M^(-4/5)) towards an O(N^(-1)) constant. This is the first time that the convergence of a convolution estimator with respect to the amount of additional covariate information has been established. In contrast to convolution estimators based on the Nadaraya-Watson estimator for a nonlinear regression model, the multiple regression-enhanced convolution estimator proposed in this paper does not suffer from the curse of dimensionality. It is particularly useful when one wants to estimate the density of a response variable that is difficult to measure while having access to a large amount of additional covariate information. An application of this type from the field of ophthalmology motivated this work.
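As a rough illustration of the idea described above, the following Python sketch fits an ordinary least squares regression on the N complete cases, evaluates the fitted regression at all N + M covariate observations, and smooths every pairwise sum of a fitted value and a residual with a Gaussian kernel. This is only one plausible reading of a regression-based convolution estimator and not necessarily the exact estimator analyzed in the paper; the function name `convolution_density`, the Gaussian kernel, the fixed bandwidth, and the simulated data are all assumptions made for illustration.

```python
import numpy as np


def convolution_density(y_grid, X_full, y_full, X_extra, bandwidth):
    """Hypothetical regression-based convolution density estimate on y_grid."""
    # Fit ordinary least squares on the N complete cases (with intercept).
    A = np.column_stack([np.ones(len(X_full)), X_full])
    beta, *_ = np.linalg.lstsq(A, y_full, rcond=None)
    residuals = y_full - A @ beta

    # Fitted values for all N + M covariate observations.
    X_all = np.vstack([X_full, X_extra])
    fitted = np.column_stack([np.ones(len(X_all)), X_all]) @ beta

    # Every pairwise sum fitted_i + residual_j is a pseudo-observation of Y.
    sums = (fitted[:, None] + residuals[None, :]).ravel()

    # Average a Gaussian kernel centred at each pseudo-observation.
    u = (np.asarray(y_grid)[:, None] - sums[None, :]) / bandwidth
    norm = sums.size * bandwidth * np.sqrt(2.0 * np.pi)
    return np.exp(-0.5 * u**2).sum(axis=1) / norm


# Simulated data (assumed for illustration): N = 50 complete cases,
# M = 500 covariate-only observations, 3 covariates.
rng = np.random.default_rng(0)
N, M, p = 50, 500, 3
coef = np.array([0.5, -1.0, 2.0])
X_full = rng.normal(size=(N, p))
y_full = 1.0 + X_full @ coef + 0.3 * rng.normal(size=N)
X_extra = rng.normal(size=(M, p))

grid = np.linspace(-12.0, 14.0, 400)
density = convolution_density(grid, X_full, y_full, X_extra, bandwidth=0.3)

# Riemann-sum check that the estimate integrates to roughly one.
total_mass = density.sum() * (grid[1] - grid[0])
```

Because the regression is linear, the covariates enter the estimate only through the one-dimensional fitted values, which is the intuition behind avoiding the curse of dimensionality; in practice the bandwidth would be chosen by a data-driven rule rather than fixed by hand as it is here.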