A Theory of Selective Prediction

02/12/2019
by Mingda Qiao, et al.

We consider a model of selective prediction, where the prediction algorithm is given a data sequence in an online fashion and asked to predict a pre-specified statistic of the upcoming data points. The algorithm is allowed to choose when to make the prediction as well as the length of the prediction window, possibly depending on the observations so far. We prove that, even without any distributional assumption on the input data stream, a large family of statistics can be estimated to non-trivial accuracy. To give one concrete example, suppose that we are given access to an arbitrary binary sequence x_1, ..., x_n of length n. Our goal is to accurately predict the average observation, and we are allowed to choose the window over which the prediction is made: for some t < n and m < n - t, after seeing t observations we predict the average of x_{t+1}, ..., x_{t+m}. We show that the expected squared error of our prediction can be bounded by O(1/log n), and prove a matching lower bound. This result holds for any sequence (that is not adaptive to when the prediction is made, or the predicted value), and the expectation of the error is with respect to the randomness of the prediction algorithm. Our results apply to more general statistics of a sequence of observations, and we highlight several open directions for future work.
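
To make the setting concrete, here is a minimal Python sketch of one randomized selective predictor for the mean of a binary sequence. It is an illustration under stated assumptions, not necessarily the algorithm analyzed in the paper: the function names `selective_mean_prediction` and `squared_error`, the power-of-two window lengths, and the rule "predict the next window's mean by the previous window's mean" are all choices made here for illustration. The prediction time t and window length m are drawn independently of the data, and the prediction uses only observations before time t. The sketch does not by itself establish the error bound stated above.

```python
import random


def selective_mean_prediction(x, rng):
    """Return (t, m, prediction) for one randomized selective prediction.

    Hypothetical helper: picks a random window length and position, observes
    the first half of that span, and predicts the mean of the second half.
    """
    n = len(x)
    if n < 3:
        raise ValueError("need at least 3 observations")
    # Largest k with 2**k <= n - 1, so every span ends before index n
    # (this keeps m < n - t, as in the text).
    levels = (n - 1).bit_length() - 1
    k = rng.randint(1, levels)           # random scale, chosen without looking at x
    half = 2 ** (k - 1)                  # prediction window length m
    span = 2 * half                      # observed half followed by predicted half
    start = span * rng.randrange((n - 1) // span)  # random aligned span
    t = start + half                     # number of observations seen so far
    prediction = sum(x[start:t]) / half  # mean of the most recent `half` points
    return t, half, prediction


def squared_error(x, t, m, prediction):
    """Squared gap between the prediction and the true mean of x[t:t+m]."""
    truth = sum(x[t:t + m]) / m
    return (prediction - truth) ** 2
```

With 0-based Python indexing, `x[t:t + m]` corresponds to x_{t+1}, ..., x_{t+m}. Calling `selective_mean_prediction(x, random.Random(0))` many times with fresh randomness and averaging `squared_error` gives an empirical estimate of the expected squared error for a fixed sequence `x`.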


research
12/08/2016

Prediction with a Short Memory

We consider the problem of predicting the next observation given a seque...
research
06/29/2021

Exponential Weights Algorithms for Selective Learning

We study the selective learning problem introduced by Qiao and Valiant (...
research
06/18/2012

Learning the Experts for Online Sequence Prediction

Online sequence prediction is the problem of predicting the next element...
research
11/29/2020

AWLCO: All-Window Length Co-Occurrence

Analyzing patterns in a sequence of events has applications in text anal...
research
09/27/2012

Reclassification formula that provides to surpass K-means method

The paper presents a formula for the reclassification of multidimensiona...
research
12/09/2021

Estimating the Longest Increasing Subsequence in Nearly Optimal Time

Longest Increasing Subsequence (LIS) is a fundamental statistic of a seq...
research
01/05/2021

Online Multivalid Learning: Means, Moments, and Prediction Intervals

We present a general, efficient technique for providing contextual predi...
