Predictive inference with the jackknife+

05/08/2019
by Rina Foygel Barber, et al.

This paper introduces the jackknife+, a novel method for constructing predictive confidence intervals. Whereas the jackknife outputs an interval centered at the predicted response of a test point, with the width of the interval determined by the quantiles of the leave-one-out residuals, the jackknife+ also uses the leave-one-out predictions at the test point to account for variability in the fitted regression function. Assuming exchangeable training samples, we prove that this crucial modification permits rigorous coverage guarantees regardless of the distribution of the data points, for any algorithm that treats the training points symmetrically. Such guarantees are not possible for the original jackknife, and we demonstrate examples where its coverage rate may in fact vanish. Our theoretical and empirical analysis reveals that the jackknife and the jackknife+ intervals achieve nearly exact coverage and have similar lengths whenever the fitting algorithm obeys some form of stability. Further, we extend the jackknife+ to K-fold cross-validation and similarly establish rigorous coverage properties. Our methods are related to cross-conformal prediction proposed by Vovk [2015], and we discuss the connections.
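The construction described above translates directly into a short procedure: for each training point i, fit the regression on the remaining n − 1 points, record the leave-one-out residual and the leave-one-out prediction at the test point, and take finite-sample quantiles of the resulting lower and upper terms. Below is a minimal sketch in Python, assuming numpy and scikit-learn's LinearRegression as an illustrative base regressor; the function name and signature are our own, and the quantile indices follow the paper's finite-sample convention.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def jackknife_plus_interval(X, y, x_test, alpha=0.1, base_model=LinearRegression):
    """Jackknife+ prediction interval for a single test point.

    X : (n, d) training features, y : (n,) responses,
    x_test : (d,) test features, alpha : target miscoverage level.
    """
    n = len(y)
    lower_terms = np.empty(n)
    upper_terms = np.empty(n)
    for i in range(n):
        mask = np.arange(n) != i
        model = base_model().fit(X[mask], y[mask])      # leave-one-out fit
        r_i = abs(y[i] - model.predict(X[i:i + 1])[0])  # leave-one-out residual
        mu_i = model.predict(x_test.reshape(1, -1))[0]  # leave-one-out prediction at x_test
        lower_terms[i] = mu_i - r_i
        upper_terms[i] = mu_i + r_i
    # Jackknife+ endpoints: the floor(alpha*(n+1))-th smallest lower term and
    # the ceil((1-alpha)*(n+1))-th smallest upper term. Indices are clamped to
    # the sample here; in the theory an out-of-range index gives an infinite endpoint.
    k_lo = int(np.floor(alpha * (n + 1)))
    k_hi = int(np.ceil((1 - alpha) * (n + 1)))
    lower = np.sort(lower_terms)[max(k_lo - 1, 0)]
    upper = np.sort(upper_terms)[min(k_hi - 1, n - 1)]
    return lower, upper
```

With alpha = 0.1 and exchangeable data, the paper's distribution-free guarantee for this interval is coverage at least 1 − 2·alpha (here, at least 80%), and coverage close to 1 − alpha is typical in practice when the fitting algorithm is stable.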

Related research:

- Cross-validation Confidence Intervals for Test Error (07/24/2020)
- Cross-validation: what does it estimate and how well does it do it? (04/01/2021)
- Fast, Distribution-free Predictive Inference for Neural Networks with Coverage Guarantees (06/11/2023)
- With Malice Towards None: Assessing Uncertainty via Equalized Coverage (08/15/2019)
- Conformal Prediction Intervals for Neural Networks Using Cross Validation (06/30/2020)
- Distribution-free inference for regression: discrete, continuous, and in between (05/28/2021)
- Rapid Approximate Aggregation with Distribution-Sensitive Interval Guarantees (08/10/2020)
