Near-Optimal Entrywise Sampling for Data Matrices

11/19/2013
by   Dimitris Achlioptas, et al.

We consider the problem of selecting non-zero entries of a matrix A in order to produce a sparse sketch of it, B, that minimizes ‖A − B‖₂. For large m × n matrices, such that n ≫ m (for example, representing n observations over m attributes) we give sampling distributions that exhibit four important properties. First, they have closed forms computable from minimal information regarding A. Second, they allow sketching of matrices whose non-zeros are presented to the algorithm in arbitrary order as a stream, with O(1) computation per non-zero. Third, the resulting sketch matrices are not only sparse, but their non-zero entries are highly compressible. Lastly, and most importantly, under mild assumptions, our distributions are provably competitive with the optimal offline distribution. Note that the probabilities in the optimal offline distribution may be complex functions of all the entries in the matrix. Therefore, regardless of computational complexity, the optimal distribution might be impossible to compute in the streaming model.
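To make the streaming setting concrete, here is a minimal sketch of entrywise sampling in Python. It is an illustration under stated assumptions, not the distribution proposed in the paper: the keep probability is a plain L1 rule, p = min(1, s·|A_ij| / ‖A‖₁), and the names `l1_bernoulli_sketch`, `l1_norm`, and `s` are hypothetical. It assumes the entrywise L1 norm of A is known before the stream starts and that each (i, j) position appears at most once.

```python
import random


def l1_bernoulli_sketch(entries, l1_norm, s, rng=None):
    """Build a sparse sketch B from a stream of (i, j, value) triples.

    Assumption (not the paper's distribution): each non-zero is kept
    independently with probability p = min(1, s * |value| / l1_norm),
    where l1_norm is the sum of |A_ij| over all entries and s is the
    expected-sample budget.
    """
    rng = rng or random.Random()
    sketch = {}
    for i, j, value in entries:
        if value == 0:
            continue  # only non-zeros are candidates for the sketch
        p = min(1.0, s * abs(value) / l1_norm)
        if rng.random() < p:
            # Rescale by 1/p so that E[B_ij] = A_ij (unbiased sketch).
            sketch[(i, j)] = value / p
    return sketch
```

Each non-zero costs one probability computation and one random draw, so the work per streamed entry is constant. Under this assumed rule, every kept entry with p < 1 is stored as ±`l1_norm / s`, so only its sign needs recording; this is one simple way the non-zeros of a sketch can be compressible.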


research
12/28/2021

The full rank condition for sparse random matrices

We derive a sufficient condition for a sparse random matrix with given n...
research
06/11/2023

Learning the Positions in CountSketch

We consider sketching algorithms which first compress data by multiplica...
research
11/03/2020

Near-Optimal Entrywise Sampling of Numerically Sparse Matrices

Many real-world data sets are sparse or almost sparse. One method to mea...
research
10/29/2020

Active Sampling Count Sketch (ASCS) for Online Sparse Estimation of a Trillion Scale Covariance Matrix

Estimating and storing the covariance (or correlation) matrix of high-di...
research
01/24/2023

Logarithmically Sparse Symmetric Matrices

A positive definite matrix is called logarithmically sparse if its matri...
research
07/11/2019

Schatten Norms in Matrix Streams: Hello Sparsity, Goodbye Dimension

The spectrum of a matrix contains important structural information about...
research
02/15/2022

Bohemian Matrix Geometry

A Bohemian matrix family is a set of matrices all of whose entries are d...
