Learning from Non-IID Data in Hilbert Spaces: An Optimal Recovery Perspective

by Simon Foucart et al.

The notion of generalization in classical Statistical Learning is often attached to the postulate that data points are independent and identically distributed (IID) random variables. While relevant in many applications, this postulate may not hold in general, encouraging the development of learning frameworks that are robust to non-IID data. In this work, we consider the regression problem from an Optimal Recovery perspective. Relying on a model assumption comparable to choosing a hypothesis class, a learner aims at minimizing the worst-case (prediction) error, without recourse to any IID assumption on the data. We first develop a semidefinite program for computing the worst-case error of any recovery map in finite-dimensional Hilbert spaces. Then, for any Hilbert space, we show that Optimal Recovery provides a formula that is user-friendly from an algorithmic point of view, as long as the hypothesis class is linear. Interestingly, this formula coincides with kernel ridgeless regression in some cases, proving that minimizing the average error and the worst-case error can yield the same solution. We provide numerical experiments in support of our theoretical findings.
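To make the connection to kernel ridgeless regression concrete, the sketch below computes the minimal-RKHS-norm interpolant of noiseless samples, i.e., the coefficients solving K a = y for the kernel Gram matrix K. This is a minimal illustration, not the paper's implementation: the Gaussian kernel, the bandwidth, the toy data, and the function names are all illustrative choices.

```python
import numpy as np

def gaussian_kernel(X, Y, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X and Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def ridgeless_fit(X, y, gamma=1.0):
    """Kernel 'ridgeless' regression: the minimal-norm interpolant in the
    RKHS of the kernel, with coefficients a solving K a = y."""
    K = gaussian_kernel(X, X, gamma)
    a = np.linalg.solve(K, y)  # K is positive definite for distinct points
    return lambda Xnew: gaussian_kernel(Xnew, X, gamma) @ a

# hypothetical toy data: noiseless samples of a sine on [0, 1]
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(8, 1))
y = np.sin(2 * np.pi * X[:, 0])
f = ridgeless_fit(X, y, gamma=10.0)

# the recovery map reproduces the observed data exactly
print(np.allclose(f(X), y))
```

The data-interpolation property printed at the end is what "ridgeless" refers to: with no regularization penalty, the fitted map matches every observation while having the smallest RKHS norm among all interpolants.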




