Nonparametric classification with missing data

05/19/2023
by Torben Sell, et al.

We introduce a new nonparametric framework for classification problems in the presence of missing data. The key aspect of our framework is that the regression function decomposes into an ANOVA-type sum of orthogonal functions, of which some (or even many) may be zero. Working under a general missingness setting, which allows features to be missing not at random, our main goal is to derive the minimax rate for the excess risk in this problem. In addition to the decomposition property, the rate depends on parameters that control the tail behaviour of the marginal feature distributions, the smoothness of the regression function, and a margin condition. The ambient data dimension does not appear in the minimax rate, which can therefore be faster than in the classical nonparametric setting. We further propose a new method, called the Hard-thresholding Anova Missing data (HAM) classifier, based on a careful combination of a k-nearest neighbour algorithm and a thresholding step. The HAM classifier attains the minimax rate up to polylogarithmic factors, and numerical experiments further illustrate its utility.
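To make the idea of combining nearest-neighbour estimates with a thresholding step more concrete, the following is a minimal sketch and not the authors' HAM algorithm. It assumes a simplified additive model with only single-feature components, encodes missing entries as NaN, and hard-thresholds small component estimates to zero before summing them. The function names `knn_component` and `threshold_knn_predict`, and the parameters `k` and `tau`, are hypothetical choices made for illustration only.

```python
import numpy as np


def knn_component(x_train_j, y_train, x_j, k):
    """k-NN estimate of E[Y - 1/2 | X_j = x_j], using only rows where feature j is observed."""
    observed = ~np.isnan(x_train_j)
    xs, ys = x_train_j[observed], y_train[observed]
    if xs.size == 0:
        return 0.0  # no information available for this feature
    k_eff = min(k, xs.size)
    nearest = np.argsort(np.abs(xs - x_j))[:k_eff]
    return float(np.mean(ys[nearest]) - 0.5)


def threshold_knn_predict(X_train, y_train, x_new, k=5, tau=0.1):
    """Classify x_new by summing the thresholded single-feature k-NN components.

    Illustrative assumption: only first-order (single-feature) components are used.
    """
    score = 0.0
    for j in range(X_train.shape[1]):
        if np.isnan(x_new[j]):
            continue  # skip features that are missing in the test point
        component = knn_component(X_train[:, j], y_train, x_new[j], k)
        if abs(component) >= tau:  # hard-thresholding: drop weak components
            score += component
    return int(score >= 0.0)


# Toy data: feature 0 separates the classes, feature 1 is weakly informative
# and partly missing (NaN).
X_train = np.array([
    [0.0, 0.5],
    [0.1, np.nan],
    [0.2, 0.3],
    [0.8, 0.6],
    [0.9, np.nan],
    [1.0, 0.5],
])
y_train = np.array([0, 0, 0, 1, 1, 1])

pred_missing = threshold_knn_predict(X_train, y_train, np.array([0.95, np.nan]), k=3, tau=0.2)
pred_complete = threshold_knn_predict(X_train, y_train, np.array([0.05, 0.5]), k=3, tau=0.2)
```

In this toy example the weak contribution of the second feature falls below `tau` and is discarded, so the prediction is driven by the first feature alone. The classifier described in the abstract works with a general ANOVA-type decomposition and a carefully chosen threshold, neither of which this sketch attempts to reproduce.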

research
08/04/2019

Full-semiparametric-likelihood-based inference for non-ignorable missing data

During the past few decades, missing-data problems have been studied ext...
research
11/09/2020

A Computationally Efficient Classification Algorithm in Posterior Drift Model: Phase Transition and Minimax Adaptivity

In massive data analysis, training and testing data often come from very...
research
02/03/2022

Minimax rate of consistency for linear models with missing values

Missing values arise in most real-world data sets due to the aggregation...
research
06/27/2022

Benign overfitting and adaptive nonparametric regression

In the nonparametric regression setting, we construct an estimator which...
research
11/28/2020

Adaptive Inference in Multivariate Nonparametric Regression Models Under Monotonicity

We consider the problem of adaptive inference on a regression function a...
research
06/07/2019

Transfer Learning for Nonparametric Classification: Minimax Rate and Adaptive Classifier

Human learners have the natural ability to use knowledge gained in one s...
research
07/18/2022

Deeply-Learned Generalized Linear Models with Missing Data

Deep Learning (DL) methods have dramatically increased in popularity in ...
