A Model-free Approach to Linear Least Squares Regression with Exact Probabilities and Applications to Covariate Selection

06/05/2019
by Laurie Davies, et al.

The classical model for linear regression is Y = xβ + σε with i.i.d. standard Gaussian errors. Much of the resulting statistical inference is based on Fisher's F-distribution. In this paper we give two approaches to least squares regression which are model-free. The results hold for all data (y, x). The derived probabilities are not only exact, they agree with those using the F-distribution based on the classical model. This is achieved by replacing questions about the size of β_j, for example β_j = 0, by questions about the degree to which the covariate x_j is better than Gaussian white noise or, alternatively, a random orthogonal rotation of x_j. The idea can be extended to choice of covariates, post-selection inference (PoSI), step-wise choice of covariates, the determination of dependency graphs, and to robust regression and non-linear regression. In the latter two cases the probabilities are no longer exact but are based on the chi-squared distribution. The step-wise choice of covariates is of particular interest: it is very simple, very fast and very powerful, it controls the number of false positives, and it does not overfit even in the case where the number of covariates far exceeds the sample size.
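
The agreement with the F-distribution can be illustrated numerically for a single covariate. The following is a minimal sketch, not the authors' implementation: the function names `rss` and `gaussian_covariate_p_value` and the simulated data are hypothetical, and the design matrix `X0` is assumed to have full column rank. It relies on a standard geometric fact: when a Gaussian white noise covariate is added to k0 existing covariates, the fraction of the residual sum of squares it removes follows a Beta(1/2, (n - k0 - 1)/2) distribution, whatever y and x are. The P-value is the probability that noise reduces the residual sum of squares at least as much as x_j does.

```python
import numpy as np
from scipy import stats


def rss(y, X):
    """Residual sum of squares of the least squares fit of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)


def gaussian_covariate_p_value(y, X0, x_new):
    """Return (p_beta, p_f) for adding x_new to the covariates in X0.

    p_beta: probability that a Gaussian white noise covariate reduces the
            residual sum of squares by at least as much as x_new does.
    p_f:    classical F-test P-value for the coefficient of x_new.
    Assumption: X0 has full column rank k0 and n > k0 + 1.
    """
    n, k0 = X0.shape
    m = n - k0 - 1  # residual degrees of freedom after adding x_new
    rss0 = rss(y, X0)
    rss1 = rss(y, np.column_stack([X0, x_new]))

    # Fraction of the residual sum of squares removed by x_new.
    reduction = 1.0 - rss1 / rss0
    p_beta = stats.beta.sf(reduction, 0.5, m / 2.0)

    # Classical F-statistic with (1, m) degrees of freedom.
    f_stat = m * (rss0 - rss1) / rss1
    p_f = stats.f.sf(f_stat, 1, m)
    return float(p_beta), float(p_f)


# Simulated data, purely for illustration.
rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 + 0.3 * x2 + rng.normal(size=n)

X0 = np.column_stack([np.ones(n), x1])  # intercept and x1 already included
p_beta, p_f = gaussian_covariate_p_value(y, X0, x2)
print(p_beta, p_f)  # the two values coincide up to rounding error
```

The two P-values coincide because B ~ Beta(1/2, m/2) implies m B / (1 - B) ~ F(1, m). The Beta version makes no assumption about how y was generated.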
