A Distribution Free Truncated Kernel Ridge Regression Estimator and Related Spectral Analyses
Kernel ridge regression (KRR) is a popular nonparametric regression estimator. However, for a large data set of size n ≫ 1, the KRR estimator is computationally expensive. Recently, scalable KRR approaches have been proposed that aim to reduce the computational complexity of KRR while maintaining its favorable convergence rate. In this work, we study a new scalable KRR-based approach for nonparametric regression. Our truncated kernel ridge regression (TKRR) approach is simple: it substitutes the full n × n random kernel (Gram) matrix B_n, associated with a Mercer kernel 𝕂, by its main n × N sub-matrix A_N, where usually N ≪ n. We also show that the TKRR works with d-dimensional random sampling data following an unknown probability law. To do so, we give a spectral analysis of the compact kernel integral operator associated with a probability measure different from its usual one, which yields an estimate of the decay of its spectrum. This decay estimate is then extended to the decay of the tail of the trace of the associated random Gram matrix. Special attention is devoted to developing rules for the optimal choice of the truncation order N and of the regularization parameter λ > 0. The proposed rules are based on the behavior and the decay rate of the spectrum of the positive integral operator associated with the kernel 𝕂. These optimal parameter values ensure that, in terms of the empirical risk error, the TKRR and the full KRR estimators have the same optimal convergence rate. Finally, we provide numerical simulations that illustrate the performance of the proposed TKRR estimator.
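The truncation idea can be illustrated with a minimal sketch, which is not the paper's exact estimator. It assumes a Gaussian kernel, takes the first N sample points as the centers defining the n × N sub-matrix, and solves a ridge-regularized least-squares problem with penalty n·λ. The function names (`gaussian_kernel`, `fit_tkrr`, `predict_tkrr`), the kernel choice, and the normalization of λ are all assumptions made for illustration.

```python
import numpy as np


def gaussian_kernel(X, Z, gamma=1.0):
    """Gaussian kernel matrix with entries exp(-gamma * ||x_i - z_j||^2)."""
    sq = (X ** 2).sum(axis=1)[:, None] + (Z ** 2).sum(axis=1)[None, :] - 2.0 * X @ Z.T
    return np.exp(-gamma * np.maximum(sq, 0.0))  # clip tiny negative round-off


def fit_tkrr(X, y, N, lam, gamma=1.0):
    """Illustrative truncated KRR fit (assumed form, not the paper's definition).

    Uses the n x N matrix A built from the first N sample points and solves
    (A^T A + n * lam * I) c = A^T y, an N x N system instead of n x n.
    """
    n = X.shape[0]
    centers = X[:N]                          # assumption: first N samples as centers
    A = gaussian_kernel(X, centers, gamma)   # n x N sub-matrix of the Gram matrix
    G = A.T @ A + n * lam * np.eye(N)
    coef = np.linalg.solve(G, A.T @ y)
    return centers, coef


def predict_tkrr(X_new, centers, coef, gamma=1.0):
    """Evaluate the fitted estimator at new points."""
    return gaussian_kernel(X_new, centers, gamma) @ coef


# Synthetic 1-d example: noisy samples of sin(2*pi*x) at random points.
rng = np.random.default_rng(0)
n, N = 400, 30
X = rng.uniform(0.0, 1.0, size=(n, 1))
f_true = np.sin(2.0 * np.pi * X[:, 0])
y = f_true + 0.1 * rng.standard_normal(n)

centers, coef = fit_tkrr(X, y, N=N, lam=1e-6, gamma=10.0)
y_hat = predict_tkrr(X, centers, coef, gamma=10.0)
rmse = float(np.sqrt(np.mean((y_hat - f_true) ** 2)))
print(f"coefficients: {coef.shape[0]}, RMSE vs. true function: {rmse:.4f}")
```

In this sketch the linear system has size N × N rather than n × n, which is where the computational saving over the full KRR would come from. The values of N, λ, and the kernel width used here are arbitrary and are not derived from the selection rules described in the abstract.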