Conditional predictive inference for high-dimensional stable algorithms

09/05/2018
by Lukas Steinberger, et al.

We investigate generically applicable and intuitively appealing prediction intervals based on leave-one-out residuals. The conditional coverage probability of the proposed interval, given the observations in the training sample, is close to the nominal level, provided that the underlying algorithm used for computing point predictions is sufficiently stable under the omission of single feature-response pairs. Our results are based on a finite sample analysis of the empirical distribution function of the leave-one-out residuals and hold in a non-parametric setting with only minimal assumptions on the error distribution. To illustrate our results, we also apply them to high-dimensional linear predictors, where we obtain uniform asymptotic conditional validity as both sample size and dimension tend to infinity at the same rate. These results show that despite the serious problems of resampling procedures for inference on the unknown parameters (cf. Bickel and Freedman, 1983; El Karoui and Purdom, 2015; Mammen, 1996), leave-one-out methods can be successfully applied to obtain reliable predictive inference even in high dimensions.
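To make the idea of an interval built from leave-one-out residuals concrete, the following is a minimal sketch, not the authors' implementation. It assumes ordinary least squares as the point predictor and forms the interval by shifting the point prediction by empirical quantiles of the leave-one-out residuals; the function names (`fit_ols`, `loo_residuals`, `loo_prediction_interval`), the simulated Gaussian data, and the choice of quantile rule are all illustrative assumptions.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares coefficients (assumed predictor; any stable algorithm could be swapped in)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def loo_residuals(X, y):
    """Residual of each pair (x_i, y_i) under the predictor refitted without that pair."""
    n = X.shape[0]
    residuals = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i          # drop the i-th observation
        coef = fit_ols(X[keep], y[keep])  # refit on the remaining n - 1 pairs
        residuals[i] = y[i] - X[i] @ coef
    return residuals


def loo_prediction_interval(X, y, x_new, alpha=0.1):
    """Point prediction at x_new shifted by empirical quantiles of the leave-one-out residuals."""
    residuals = loo_residuals(X, y)
    q_low, q_high = np.quantile(residuals, [alpha / 2, 1 - alpha / 2])
    center = x_new @ fit_ols(X, y)        # prediction from the full training sample
    return center + q_low, center + q_high


# Simulated example (hypothetical data): n = 200 observations, p = 20 features.
rng = np.random.default_rng(0)
n, p = 200, 20
X = rng.standard_normal((n, p))
beta = rng.standard_normal(p)
y = X @ beta + rng.standard_normal(n)

x_new = rng.standard_normal(p)
lower, upper = loo_prediction_interval(X, y, x_new, alpha=0.1)
print(f"90% prediction interval at x_new: [{lower:.3f}, {upper:.3f}]")
```

The loop refits the predictor n times, which is affordable for least squares at this size; for a costlier algorithm one would need a cheaper way to obtain the leave-one-out fits. The stability condition in the abstract corresponds here to each refit changing the prediction only slightly when a single pair is omitted.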


