'Basic' Generalization Error Bounds for Least Squares Regression with Well-specified Models
This note examines the generalization behavior, measured by out-of-sample mean squared error (MSE), of Linear Gaussian regression (with a fixed design matrix) and Linear Least Squares regression. In particular, we consider a well-specified model setting, i.e., we assume that there exists a 'true' combination of model parameters within the chosen model form. While the statistical properties of Least Squares regression have been extensively studied over the past few decades, often under less restrictive problem statements than in the present work, this note targets bounds that are non-asymptotic and more quantitative than those in the literature. Further, the analytical formulae for the distributions of, and bounds on, the MSE are directly compared to numerical experiments. Derivations are presented in a self-contained and pedagogical manner, so that a reader with a basic knowledge of probability and statistics can follow them.
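To make the setting concrete, the following is a minimal sketch of the kind of numerical check the abstract alludes to. It is not the note's own experiment or bound. It assumes a fixed design matrix `X`, Gaussian noise with standard deviation `sigma`, and a well-specified linear model, and it compares Monte Carlo estimates against the standard textbook identities for ordinary least squares: expected training MSE of sigma^2 (1 - p/n) and expected MSE on fresh responses at the same design points of sigma^2 (1 + p/n). The function name and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def simulate_fixed_design_mse(n=50, p=5, sigma=1.0, trials=20000, seed=0):
    """Monte Carlo estimate of training and out-of-sample MSE for OLS.

    Assumptions (illustrative, not taken from the note): the design matrix
    is drawn once and then held fixed, the model is well-specified, and
    fresh responses are observed at the same design points.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))      # fixed design, full column rank a.s.
    theta = rng.standard_normal(p)       # the 'true' parameters
    signal = X @ theta                   # noiseless responses

    # Hat matrix H = X (X^T X)^{-1} X^T, computed via the pseudoinverse.
    H = X @ np.linalg.pinv(X)

    # Each row is one independent noise realization.
    y_train = signal + sigma * rng.standard_normal((trials, n))
    y_new = signal + sigma * rng.standard_normal((trials, n))

    # Fitted values for every trial: row-wise application of H.
    y_hat = y_train @ H.T

    train_mse = np.mean((y_train - y_hat) ** 2)
    test_mse = np.mean((y_new - y_hat) ** 2)
    return train_mse, test_mse

if __name__ == "__main__":
    n, p, sigma = 50, 5, 1.0
    train_mse, test_mse = simulate_fixed_design_mse(n=n, p=p, sigma=sigma)
    print("training MSE:      simulated", train_mse, "theory", sigma**2 * (1 - p / n))
    print("out-of-sample MSE: simulated", test_mse, "theory", sigma**2 * (1 + p / n))
```

The gap between the two quantities, 2 sigma^2 p / n, depends only on the noise level and the ratio of parameters to samples, which is why non-asymptotic statements are possible in this restricted setting.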