How many variables should be entered in a principal component regression equation?
We study least squares linear regression over N uncorrelated Gaussian features that are selected in order of decreasing variance. When the number of selected features p is at most the sample size n, the estimator under consideration coincides with the principal component regression estimator; when p > n, the estimator is the least ℓ_2 norm solution over the selected features. We give an average-case analysis of the out-of-sample prediction error as p, n, N → ∞ with p/N → α and n/N → β, for some constants α ∈ [0,1] and β ∈ (0,1). In this average-case setting, the prediction error exhibits a "double descent" shape as a function of p.
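
The setting described above can be explored numerically. The following is a minimal simulation sketch, not the paper's analysis: the function name `simulate_risk`, the linearly decaying variance profile, the Gaussian prior on the coefficients, and the noise level are all assumptions chosen for illustration. It fits least squares on the p highest-variance features (`np.linalg.lstsq` returns the ordinary least squares solution when p ≤ n with full column rank, and the least ℓ_2 norm solution when p > n) and averages the out-of-sample prediction error over random draws.

```python
import numpy as np

def simulate_risk(N=100, n=40, p_values=None, trials=50, noise_sd=0.5, seed=0):
    """Monte Carlo estimate of out-of-sample prediction error versus p.

    All modelling choices here (variance profile, coefficient prior,
    noise level) are illustrative assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    # Assumed feature variances, already sorted in decreasing order,
    # so "select the top p by variance" means "take the first p columns".
    variances = np.linspace(2.0, 0.5, N)
    sd = np.sqrt(variances)
    if p_values is None:
        p_values = range(1, N + 1)

    risks = {}
    for p in p_values:
        errs = []
        for _ in range(trials):
            # Average case: draw a fresh random coefficient vector each trial.
            theta = rng.standard_normal(N) / np.sqrt(N)
            X = rng.standard_normal((n, N)) * sd  # uncorrelated Gaussian features
            y = X @ theta + noise_sd * rng.standard_normal(n)

            # Least squares on the first p features; for p > n this is the
            # minimum l2-norm interpolating solution.
            coef, *_ = np.linalg.lstsq(X[:, :p], y, rcond=None)
            theta_hat = np.zeros(N)
            theta_hat[:p] = coef

            # Exact expected squared error on a fresh point when features
            # are uncorrelated: sum_j var_j * (theta_hat_j - theta_j)^2 + noise.
            errs.append(np.sum(variances * (theta_hat - theta) ** 2) + noise_sd**2)
        risks[p] = float(np.mean(errs))
    return risks
```

Calling `simulate_risk()` and plotting the returned values against p is one way to look for the shape described in the abstract; under these assumptions the estimated error would typically be expected to spike near p = n and fall again as p grows toward N, though the exact curve depends on the assumed variance profile and noise level.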