Sparse Coresets for SVD on Infinite Streams

02/15/2020
by   Vladimir Braverman, et al.

In streaming Singular Value Decomposition (SVD), d-dimensional rows of a possibly infinite matrix arrive sequentially as points in R^d. An ϵ-coreset is a (much smaller) matrix whose sum of squared distances from its rows to any hyperplane approximates that of the original matrix to within a 1 ± ϵ factor. Our main result is that we can maintain an ϵ-coreset while storing only O(d log^2 d / ϵ^2) rows. Known lower bounds of Ω(d / ϵ^2) rows show that this is nearly optimal. Moreover, our coreset is a weighted subset of the input rows. This is highly desirable since it: (1) preserves sparsity; (2) is easily interpretable; (3) avoids precision errors; (4) applies to problems with constraints on the input. Previous streaming results for SVD that return a subset of the input required storing Ω(d log^3 n / ϵ^2) rows, where n is the number of rows seen so far. Our algorithm, with storage independent of n, is the first result that uses finite memory on infinite streams. We support our findings with experiments on the Wikipedia dataset, benchmarked against state-of-the-art algorithms.
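To make the ϵ-coreset definition concrete, the following is a minimal Python sketch, not the paper's algorithm. It assumes hyperplanes through the origin, so the sum of squared distances from the rows to a hyperplane with unit normal x is the weighted sum of (a_i · x)^2. The names `hyperplane_cost` and `max_relative_error` and the toy matrix are hypothetical, introduced only for illustration. The "coreset" here is the trivial one obtained by merging duplicate rows and weighting each by its multiplicity, which is a weighted subset of the input with zero error.

```python
import numpy as np

def hyperplane_cost(rows, normal, weights=None):
    """Weighted sum of squared distances from the rows to the hyperplane
    through the origin (an assumption) with the given normal vector."""
    rows = np.asarray(rows, dtype=float)
    normal = np.asarray(normal, dtype=float)
    unit = normal / np.linalg.norm(normal)
    if weights is None:
        weights = np.ones(rows.shape[0])
    return float(np.sum(weights * (rows @ unit) ** 2))

def max_relative_error(A, C, w, trials=1000, seed=0):
    """Largest relative cost error observed over randomly drawn hyperplanes.
    This samples hyperplanes; it does not certify the bound for all of them."""
    rng = np.random.default_rng(seed)
    worst = 0.0
    for _ in range(trials):
        x = rng.standard_normal(A.shape[1])
        full = hyperplane_cost(A, x)
        approx = hyperplane_cost(C, x, w)
        worst = max(worst, abs(approx - full) / full)
    return worst

# Toy input (hypothetical): six rows in R^3 with repeated rows.
A = np.array([[1.0, 0.0, 2.0],
              [1.0, 0.0, 2.0],
              [0.0, 3.0, 0.0],
              [0.0, 3.0, 0.0],
              [0.0, 3.0, 0.0],
              [4.0, 1.0, 0.0]])

# Trivial weighted subset: keep each distinct row once, weighted by its count.
C, counts = np.unique(A, axis=0, return_counts=True)
w = counts.astype(float)

err = max_relative_error(A, C, w)
print(f"{C.shape[0]} weighted rows instead of {A.shape[0]}, max relative error {err:.2e}")
```

A weighted subset (C, w) would count as an ϵ-coreset in this sketch when the relative error stays at most ϵ for every hyperplane, not only for the sampled ones. Because C reuses input rows unchanged, any sparsity in the rows of A carries over to C.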

research
09/10/2022

A mixed precision Jacobi SVD algorithm

We propose a mixed precision Jacobi algorithm for computing the singular...
research
07/02/2019

Tight Sensitivity Bounds For Smaller Coresets

An ε-coreset for Least-Mean-Squares (LMS) of a matrix A∈R^n× d is a smal...
research
11/28/2018

SVD-PHAT: A Fast Sound Source Localization Method

This paper introduces a new localization method called SVD-PHAT. The SVD...
research
12/12/2017

Approximate Convex Hull of Data Streams

Given a finite set of points P ⊆R^d, we would like to find a small subse...
research
08/17/2022

Distributed Out-of-Memory SVD on CPU/GPU Architectures

We propose an efficient, distributed, out-of-memory implementation of th...
research
12/05/2012

Using Wikipedia to Boost SVD Recommender Systems

Singular Value Decomposition (SVD) has been used successfully in recent ...
research
10/08/2020

Deep Learning Meets Projective Clustering

A common approach for compressing NLP networks is to encode the embeddin...
