Multi-Observation Regression

02/27/2018
by   Rafael Frongillo, et al.
0

Recent work introduced loss functions which measure the error of a prediction based on multiple simultaneous observations or outcomes. In this paper, we explore the theoretical and practical questions that arise when using such multi-observation losses for regression on data sets of (x,y) pairs. When a loss depends on only one observation, the average empirical loss decomposes by applying the loss to each pair, but for the multi-observation case, empirical loss is not even well-defined, and the possibility of statistical guarantees is unclear without several (x,y) pairs with exactly the same x value. We propose four algorithms formalizing the concept of empirical risk minimization for this problem, two of which have statistical guarantees in settings allowing both slow and fast convergence rates, but which are out-performed empirically by the other two. Empirical results demonstrate practicality of these algorithms in low-dimensional settings, while lower bounds demonstrate intrinsic difficulty in higher dimensions. Finally, we demonstrate the potential benefit of the algorithms over natural baselines that use traditional single-observation losses via both lower bounds and simulations.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset