An Equivalence Principle for the Spectrum of Random Inner-Product Kernel Matrices
We consider random matrices whose entries are obtained by applying a (nonlinear) kernel function to the pairwise inner products between n independent data vectors drawn uniformly from the unit sphere in ℝ^d. Our study of this model is motivated by problems in machine learning, statistics, and signal processing, where such inner-product kernel random matrices and their spectral properties play important roles. Under mild conditions on the kernel function, we establish the weak limit of the empirical spectral distribution of these matrices when d, n → ∞ such that n/d^ℓ → κ, for some fixed ℓ ∈ ℕ and κ ∈ (0, ∞). This generalizes an earlier result of Cheng and Singer (2013), who studied the same model in the linear scaling regime (with ℓ = 1 and n/d → κ). The main insight of our work is a general equivalence principle: the spectrum of the random kernel matrix is asymptotically equivalent to that of a simpler matrix model, constructed as a linear combination of a (shifted) Wishart matrix and an independent matrix drawn from the Gaussian orthogonal ensemble. The aspect ratio of the Wishart matrix and the coefficients of the linear combination are determined by ℓ and by the expansion of the kernel function in the orthogonal Hermite polynomial basis. Consequently, the limiting spectrum of the random kernel matrix can be characterized as the free additive convolution of a Marchenko-Pastur law and a semicircle law.
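As a rough numerical illustration of the equivalence principle in the linear regime (ℓ = 1), the sketch below builds a kernel matrix from points on the sphere and a surrogate made of a shifted Wishart matrix plus an independent GOE matrix, then compares the second moments of their spectra. Several choices here are assumptions and are not stated in the abstract: the normalization A_ij = f(√d ⟨x_i, x_j⟩)/√n with a zeroed diagonal, the kernel f = a1·h1 + a2·h2 with normalized Hermite polynomials h1(x) = x and h2(x) = (x² − 1)/√2, the use of a1 and a2 as the surrogate's coefficients, and all function names.

```python
import numpy as np

def sphere_points(n, d, rng):
    """Draw n points uniformly from the unit sphere in R^d."""
    g = rng.standard_normal((n, d))
    return g / np.linalg.norm(g, axis=1, keepdims=True)

def kernel_matrix(x, a1, a2):
    """Kernel matrix under the ASSUMED normalization f(sqrt(d) <x_i, x_j>) / sqrt(n)."""
    n, d = x.shape
    xi = np.sqrt(d) * (x @ x.T)            # rescaled pairwise inner products
    h1 = xi                                # first normalized Hermite polynomial
    h2 = (xi**2 - 1.0) / np.sqrt(2.0)      # second normalized Hermite polynomial
    a = (a1 * h1 + a2 * h2) / np.sqrt(n)
    np.fill_diagonal(a, 0.0)               # assumption: diagonal entries set to zero
    return a

def surrogate_matrix(n, d, a1, a2, rng):
    """Shifted Wishart plus independent GOE, with ASSUMED coefficients a1 and a2."""
    g = rng.standard_normal((n, d))
    wishart = np.sqrt(d / n) * (g @ g.T / d - np.eye(n))   # shifted, rescaled Wishart
    z = rng.standard_normal((n, n))
    goe = (z + z.T) / np.sqrt(2.0 * n)     # off-diagonal variance 1/n
    return a1 * wishart + a2 * goe

rng = np.random.default_rng(0)
n, d = 400, 200                            # kappa = n / d = 2 in the linear regime
a1, a2 = 0.8, 0.6                          # hypothetical Hermite coefficients

eig_kernel = np.linalg.eigvalsh(kernel_matrix(sphere_points(n, d, rng), a1, a2))
eig_surrogate = np.linalg.eigvalsh(surrogate_matrix(n, d, a1, a2, rng))

# Second moments of the two empirical spectral distributions.
m2_kernel = float(np.mean(eig_kernel**2))
m2_surrogate = float(np.mean(eig_surrogate**2))
print(m2_kernel, m2_surrogate)
```

Both second moments should land near a1² + a2² at these sizes. Matching a single moment is only a sanity check, not evidence of the full weak-limit statement; comparing histograms of `eig_kernel` and `eig_surrogate` gives a more informative picture.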