
Conditional predictive inference for high-dimensional stable algorithms

by Lukas Steinberger, et al.
University of Freiburg
Universität Wien

We investigate generically applicable and intuitively appealing prediction intervals based on leave-one-out residuals. The conditional coverage probability of the proposed interval, given the observations in the training sample, is close to the nominal level, provided that the underlying algorithm used for computing point predictions is sufficiently stable under the omission of single feature-response pairs. Our results are based on a finite sample analysis of the empirical distribution function of the leave-one-out residuals and hold in a non-parametric setting with only minimal assumptions on the error distribution. To illustrate our results, we also apply them to high-dimensional linear predictors, where we obtain uniform asymptotic conditional validity as both sample size and dimension tend to infinity at the same rate. These results show that despite the serious problems of resampling procedures for inference on the unknown parameters (cf. Bickel and Freedman, 1983; El Karoui and Purdom, 2015; Mammen, 1996), leave-one-out methods can be successfully applied to obtain reliable predictive inference even in high dimensions.
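The idea in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: it assumes the interval is formed by shifting the full-sample point prediction by empirical quantiles of the leave-one-out residuals, and it uses ordinary least squares as a stand-in for the "stable algorithm". The function names (`fit_ols`, `loo_residuals`, `loo_prediction_interval`), the symmetric `alpha / 2` quantile choice, and the simulated data are all illustrative assumptions.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares coefficients; a stand-in for any stable predictor (assumption)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def loo_residuals(X, y, fit=fit_ols):
    """Residual of each observation under a model trained without that observation."""
    n = len(y)
    res = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i          # drop the i-th feature-response pair
        beta = fit(X[keep], y[keep])      # refit on the remaining n - 1 pairs
        res[i] = y[i] - X[i] @ beta       # out-of-sample error at the held-out point
    return res


def loo_prediction_interval(X, y, x_new, alpha=0.1, fit=fit_ols):
    """Point prediction shifted by empirical quantiles of the leave-one-out residuals."""
    res = loo_residuals(X, y, fit)
    q_lo, q_hi = np.quantile(res, [alpha / 2, 1 - alpha / 2])
    center = x_new @ fit(X, y)            # prediction from the full training sample
    return center + q_lo, center + q_hi


# Simulated example: dimension p is a fixed fraction of the sample size n.
rng = np.random.default_rng(0)
n, p = 200, 20
X = rng.standard_normal((n, p))
beta_true = rng.standard_normal(p)
y = X @ beta_true + rng.standard_normal(n)

x_new = rng.standard_normal(p)
lower, upper = loo_prediction_interval(X, y, x_new, alpha=0.1)

# Rough check of coverage conditional on this one training sample.
X_test = rng.standard_normal((2000, p))
y_test = X_test @ beta_true + rng.standard_normal(2000)
q_lo, q_hi = np.quantile(loo_residuals(X, y), [0.05, 0.95])
pred_test = X_test @ fit_ols(X, y)
coverage = np.mean((y_test >= pred_test + q_lo) & (y_test <= pred_test + q_hi))
```

Here `coverage` estimates the conditional coverage probability for the fixed training sample, which is the quantity the abstract says should be close to the nominal level (0.9 in this sketch). The explicit refitting loop is chosen for generality; for least squares specifically, closed-form leave-one-out shortcuts exist but would hide the generic structure.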




Provable More Data Hurt in High Dimensional Least Squares Estimator

This paper investigates the finite-sample prediction risk of the high-di...

Exact and Approximate Conformal Inference in Multiple Dimensions

It is common in machine learning to estimate a response y given covariat...

Predictive Interval Models for Non-parametric Regression

Having a regression model, we are interested in finding two-sided interv...

Approximation to Object Conditional Validity with Conformal Predictors

Conformal predictors are machine learning algorithms that output predict...

Stable Conformal Prediction Sets

When one observes a sequence of variables (x_1, y_1), ..., (x_n, y_n), c...

Distribution-free inference for regression: discrete, continuous, and in between

In data analysis problems where we are not able to rely on distributiona...

Risk bounds when learning infinitely many response functions by ordinary linear regression

Consider the problem of learning a large number of response functions si...