Average Case Column Subset Selection for Entrywise ℓ_1-Norm Loss

04/16/2020
by Zhao Song, et al.

We study the column subset selection problem with respect to the entrywise ℓ_1-norm loss. It is known that in the worst case, one needs an arbitrarily large n^{Ω(1)} number of columns to obtain a (1+ϵ)-approximation to the best entrywise ℓ_1-norm rank-k approximation of an n × n matrix. Nevertheless, we show that under certain minimal and realistic distributional settings, it is possible to obtain a (1+ϵ)-approximation in nearly linear running time using poly(k/ϵ) + O(k log n) columns. Namely, we show that if the input matrix A has the form A = B + E, where B is an arbitrary rank-k matrix and E is a matrix with i.i.d. entries drawn from any distribution μ for which the (1+γ)-th moment exists, for an arbitrarily small constant γ > 0, then it is possible to obtain a (1+ϵ)-approximate column subset selection for the entrywise ℓ_1-norm in nearly linear time. Conversely, we show that if the first moment does not exist, then it is not possible to obtain a (1+ϵ)-approximate subset selection algorithm, even if one chooses any n^{o(1)} columns. This is the first algorithm of any kind for achieving a (1+ϵ)-approximation for entrywise ℓ_1-norm loss low rank approximation.
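To make the setting concrete, the following is a minimal sketch of the input model A = B + E and of the entrywise ℓ_1 cost of a column subset. It is not the paper's algorithm. It assumes k = 1 and a subset consisting of a single column, far fewer than the poly(k/ϵ) + O(k log n) columns the abstract calls for, so the cost it reports carries no (1+ϵ) guarantee. The symmetric Pareto noise, the parameter `alpha`, and all function names are illustrative assumptions. For one column the ℓ_1 regression has an exact solution as a weighted median, which keeps the example free of external solvers.

```python
import random


def l1_norm(M):
    """Entrywise l1 norm: sum of absolute values of all entries."""
    return sum(abs(x) for row in M for x in row)


def weighted_median_slope(a, b):
    """Exact minimizer x of sum_i |a[i] * x - b[i]|.

    Each term equals |a[i]| * |x - b[i] / a[i]|, so a minimizer is a
    weighted median of the ratios b[i] / a[i] with weights |a[i]|.
    """
    pts = sorted((bi / ai, abs(ai)) for ai, bi in zip(a, b) if ai != 0)
    if not pts:
        return 0.0
    half = sum(w for _, w in pts) / 2
    acc = 0.0
    for ratio, w in pts:
        acc += w
        if acc >= half:
            return ratio
    return pts[-1][0]


def make_instance(n, alpha=1.5, seed=0):
    """Return (A, B, E) with A = B + E, B of rank 1, E i.i.d. noise.

    Assumption for illustration: E is symmetric Pareto noise with shape
    alpha, whose p-th absolute moment is finite exactly when p < alpha.
    With alpha = 1.5 the (1 + gamma)-th moment exists for gamma < 0.5.
    """
    rng = random.Random(seed)
    u = [rng.gauss(0, 1) for _ in range(n)]
    v = [rng.gauss(0, 1) for _ in range(n)]
    B = [[u[i] * v[j] for j in range(n)] for i in range(n)]
    E = [[rng.choice((-1, 1)) * rng.paretovariate(alpha) for _ in range(n)]
         for _ in range(n)]
    A = [[B[i][j] + E[i][j] for j in range(n)] for i in range(n)]
    return A, B, E


def single_column_cost(A, j):
    """l1 cost of approximating every column of A by a multiple of column j."""
    n = len(A)
    col = [A[i][j] for i in range(n)]
    total = 0.0
    for t in range(len(A[0])):
        target = [A[i][t] for i in range(n)]
        x = weighted_median_slope(col, target)
        total += sum(abs(col[i] * x - target[i]) for i in range(n))
    return total


if __name__ == "__main__":
    A, B, E = make_instance(30)
    print("||E||_1              =", l1_norm(E))
    print("cost using column 0  =", single_column_cost(A, 0))
```

Comparing the printed cost with ‖E‖_1 shows how far a single noisy column is from the noise level that a larger, carefully chosen subset is meant to approach.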


research
07/20/2020

Optimal ℓ_1 Column Subset Selection and a Fast PTAS for Low Rank Approximation

We study the problem of entrywise ℓ_1 low rank approximation. We give th...
research
04/19/2023

Column Subset Selection and Nyström Approximation via Continuous Optimization

We propose a continuous optimization algorithm for the Column Subset Sel...
research
04/18/2023

New Subset Selection Algorithms for Low Rank Approximation: Offline and Online

Subset selection for the rank k approximation of an n× d matrix A offers...
research
07/16/2021

Streaming and Distributed Algorithms for Robust Column Subset Selection

We give the first single-pass streaming algorithm for Column Subset Sele...
research
05/17/2015

Provably Correct Algorithms for Matrix Column Subset Selection with Selectively Sampled Data

We consider the problem of matrix column subset selection, which selects...
research
12/17/2007

An Approximation Ratio for Biclustering

The problem of biclustering consists of the simultaneous clustering of r...
research
06/17/2018

Subspace Embedding and Linear Regression with Orlicz Norm

We consider a generalization of the classic linear regression problem to...
