Improved guarantees and a multiple-descent curve for the Column Subset Selection Problem and the Nyström method

02/21/2020
by Michał Dereziński, et al.

The Column Subset Selection Problem (CSSP) and the Nyström method are among the leading tools for constructing small low-rank approximations of large datasets in machine learning and scientific computing. A fundamental question in this area is: how well can a data subset of size k compete with the best rank k approximation? We develop techniques which exploit spectral properties of the data matrix to obtain improved approximation guarantees which go beyond the standard worst-case analysis. Our approach leads to significantly better bounds for datasets with known rates of singular value decay, e.g., polynomial or exponential decay. Our analysis also reveals an intriguing phenomenon: the approximation factor as a function of k may exhibit multiple peaks and valleys, which we call a multiple-descent curve. A lower bound we establish shows that this behavior is not an artifact of our analysis, but rather it is an inherent property of the CSSP and Nyström tasks. Finally, using the example of a radial basis function (RBF) kernel, we show that both our improved bounds and the multiple-descent curve can be observed on real datasets simply by varying the RBF parameter.
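
To make the central question concrete, the following is a minimal sketch of the quantity the abstract calls the approximation factor: the squared Frobenius error of projecting a matrix onto k of its own columns, divided by the error of the best rank-k approximation. The function names (`cssp_error`, `best_rank_k_error`, `approximation_factor`), the synthetic matrix with polynomially decaying singular values, and the uniform random choice of columns are all assumptions made for illustration. In particular, uniform sampling is only a stand-in here and is not the selection method analyzed in the paper.

```python
import numpy as np


def cssp_error(A, cols):
    """Squared Frobenius error of projecting A onto the span of the selected columns."""
    C = A[:, cols]
    # Least squares gives the coefficients X minimizing ||A - C X||_F,
    # so C @ X is the projection of A onto span(C), even if C is rank deficient.
    X = np.linalg.lstsq(C, A, rcond=None)[0]
    return float(np.linalg.norm(A - C @ X, "fro") ** 2)


def best_rank_k_error(A, k):
    """Squared Frobenius error of the best rank-k approximation (Eckart-Young)."""
    s = np.linalg.svd(A, compute_uv=False)
    return float(np.sum(s[k:] ** 2))


def approximation_factor(A, cols):
    """Ratio of the column-subset error to the best rank-k error, with k = len(cols)."""
    return cssp_error(A, cols) / best_rank_k_error(A, len(cols))


# Illustrative data (an assumption, not from the paper): a 50 x 30 matrix
# whose singular values decay polynomially as 1/i.
rng = np.random.default_rng(0)
m, n, k = 50, 30, 5
U, _ = np.linalg.qr(rng.standard_normal((m, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
singular_values = 1.0 / np.arange(1, n + 1)
A = (U * singular_values) @ V.T

# Uniformly random column subset of size k (a placeholder selection rule).
cols = rng.choice(n, size=k, replace=False)
print("approximation factor:", approximation_factor(A, cols))
```

The ratio is always at least 1, because the projection onto k columns has rank at most k. Repeating the computation over a range of k, with a chosen selection rule, is one way to plot the factor as a function of k.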

research
04/19/2023

Column Subset Selection and Nyström Approximation via Continuous Optimization

We propose a continuous optimization algorithm for the Column Subset Sel...
research
08/16/2019

Low-rank approximation in the Frobenius norm by column and row subset selection

A CUR approximation of a matrix A is a particular type of low-rank appro...
research
06/23/2017

On the numerical rank of radial basis function kernel matrices in high dimension

Low-rank approximations are popular techniques to reduce the high comput...
research
10/30/2019

Optimal Analysis of Subset-Selection Based L_p Low Rank Approximation

We study the low rank approximation problem of any given matrix A over R...
research
03/06/2018

Flip-Flop Spectrum-Revealing QR Factorization and Its Applications on Singular Value Decomposition

We present Flip-Flop Spectrum-Revealing QR (Flip-Flop SRQR) factorizatio...
research
09/04/2010

On the Estimation of Coherence

Low-rank matrix approximations are often used to help scale standard mac...
research
06/16/2022

Generalized Leverage Scores: Geometric Interpretation and Applications

In problems involving matrix computations, the concept of leverage has f...