A Large Dimensional Analysis of Least Squares Support Vector Machines
In this article, a large dimensional performance analysis of kernel least squares support vector machines (LS-SVMs) is provided under the assumption of a two-class Gaussian mixture model for the input data. Building upon recent advances in random matrix theory, we show that, when both the data dimension p and the number of samples n grow large at the same rate, the LS-SVM decision function converges to a normally distributed random variable whose mean and variance depend explicitly on the local behavior of the kernel function. This theoretical result is then applied to the MNIST data sets which, despite their non-Gaussianity, exhibit a surprisingly similar behavior. Our analysis provides a deeper understanding of the mechanisms at play in SVM-type methods, in particular of the impact of the choice of the kernel function, as well as of some of their theoretical limits.
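As a rough illustration of the setting described above, the following is a minimal sketch (not the authors' code) of an LS-SVM trained on a synthetic two-class Gaussian mixture, with the empirical mean and standard deviation of the decision values reported so their approximately Gaussian behavior can be inspected. All parameter choices here (p, n, the kernel bandwidth scaling with p, the regularization gamma) are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
p, n, n_test = 256, 512, 1000      # dimension and sample size grow together
mu = np.zeros(p); mu[0] = 2.0      # class means at -mu and +mu (assumption)

def sample(m, mean):
    return rng.standard_normal((m, p)) + mean

# Two-class Gaussian mixture training data with labels in {-1, +1}
X = np.vstack([sample(n // 2, -mu), sample(n // 2, +mu)])
y = np.concatenate([-np.ones(n // 2), np.ones(n // 2)])

def rbf(A, B, sigma2):
    # Gaussian kernel; squared distances via the expansion ||a-b||^2
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-d2 / (2 * sigma2))

sigma2 = float(p)   # bandwidth scaled with p, mimicking kernels of ||x-x'||^2 / p
gamma = 1.0         # regularization parameter (assumption)
K = rbf(X, X, sigma2)

# Standard LS-SVM training (Suykens-style): solve the linear system
#   [ 0    1^T      ] [b]     [0]
#   [ 1    K + I/g  ] [alpha] [y]
A = np.zeros((n + 1, n + 1))
A[0, 1:] = 1.0
A[1:, 0] = 1.0
A[1:, 1:] = K + np.eye(n) / gamma
sol = np.linalg.solve(A, np.concatenate([[0.0], y]))
b, alpha = sol[0], sol[1:]

# Decision values g(x) = alpha^T k(x) + b on fresh data from one class;
# for large n, p these should concentrate around a Gaussian profile
X_test = sample(n_test, +mu)
g = rbf(X_test, X, sigma2) @ alpha + b
print("empirical mean / std of g(x):", g.mean(), g.std())
```

Plotting a histogram of `g` against a normal density fitted to its empirical mean and variance is a simple way to eyeball the Gaussian limit the paper establishes.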