Learning a Latent Simplex in Input-Sparsity Time

05/17/2021
by Ainesh Bakshi et al.

We consider the problem of learning a latent k-vertex simplex K ⊂ ℝ^d, given access to a matrix A ∈ ℝ^{d×n}, which can be viewed as a data matrix of n points obtained by randomly perturbing latent points in the simplex K (potentially beyond K). A large class of latent variable models, such as adversarial clustering, mixed membership stochastic block models, and topic models, can be cast as learning a latent simplex. Bhattacharyya and Kannan (SODA 2020) give an algorithm for learning such a latent simplex in time roughly O(k·nnz(A)), where nnz(A) is the number of non-zeros in A. We show that the dependence on k in the running time is unnecessary, given a natural assumption about the mass of the top k singular values of A that holds in many of these applications. Further, we show this assumption is necessary: without it, an algorithm for learning a latent simplex would imply an algorithmic breakthrough for spectral low-rank approximation.

At a high level, Bhattacharyya and Kannan give an adaptive algorithm that makes k matrix-vector product queries to A, where each query is a function of all queries preceding it. Since each matrix-vector product requires nnz(A) time, their overall running time appears unavoidable. Instead, we obtain a low-rank approximation to A in input-sparsity time and show that the column space thus obtained has small sin Θ (angular) distance to the top-k singular space of A. Our algorithm then selects, for each of k carefully chosen random vectors, the point in the low-rank subspace with the largest inner product with that vector. By working in the low-rank subspace, we avoid reading the entire matrix in each iteration and thus circumvent the Θ(k·nnz(A)) running time.
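To make the two-step approach concrete, here is a minimal Python sketch of the pipeline, not the authors' implementation: it uses a CountSketch matrix as the input-sparsity-time sketch and plain Gaussian directions for the selection step, whereas the paper chooses its random vectors carefully and proves perturbation-robust guarantees. The helper names (countsketch, approx_top_subspace, pick_vertices) and the oversampling constant are illustrative assumptions.

```python
import numpy as np
import scipy.sparse as sp

def countsketch(m, n, rng):
    # Sparse CountSketch matrix S (m x n): one random +/-1 entry per column,
    # so applying S to a matrix costs time proportional to its nnz.
    rows = rng.integers(0, m, size=n)
    signs = rng.choice([-1.0, 1.0], size=n)
    return sp.csr_matrix((signs, (rows, np.arange(n))), shape=(m, n))

def approx_top_subspace(A, k, rng, oversample=10):
    # Randomized range finder with a sparse sketch: Y = A S^T touches each
    # non-zero of A once; QR then gives an orthonormal basis Q whose span
    # approximates the top-k column (singular) space of A.
    d, n = A.shape
    S = countsketch(k + oversample, n, rng)
    Y = (S @ A.T).T                      # d x (k + oversample), O(nnz(A)) time
    if sp.issparse(Y):
        Y = Y.toarray()
    Q, _ = np.linalg.qr(Y)
    return Q

def pick_vertices(A, Q, k, rng):
    # Project every point into the low-dimensional subspace once; the k
    # selection rounds below then never touch A again. (Forming P naively
    # costs O(k * nnz(A)); the paper obtains an equivalent factorization in
    # input-sparsity time, which this toy sketch does not reproduce.)
    P = np.asarray((A.T @ Q).T)          # (k + oversample) x n coordinates
    return [int(np.argmax(rng.standard_normal(Q.shape[1]) @ P))
            for _ in range(k)]           # best point per random direction

# Toy usage: 2000 noisy points near a latent 3-vertex simplex in R^50.
rng = np.random.default_rng(0)
V = rng.random((50, 3))                              # latent vertices
W = rng.dirichlet(np.ones(3), size=2000).T           # convex combinations
A = sp.csr_matrix(V @ W + 0.01 * rng.standard_normal((50, 2000)))
Q = approx_top_subspace(A, k=3, rng=rng)
print(pick_vertices(A, Q, k=3, rng=rng))             # candidate vertex columns
```

Only the sketch multiplication reads all of A; every later round works with (k + oversample)-dimensional coordinates, which is the intuition behind removing the factor of k from the running time.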

Related research

09/29/2019
Optimal Sketching for Kronecker Product Regression and Low Rank Approximation
We study the Kronecker product regression problem, in which the design m...

11/01/2021
Improved Algorithms for Low Rank Approximation from Sparsity
We overcome two major bottlenecks in the study of low rank approximation...

04/12/2019
Low-rank binary matrix approximation in column-sum norm
We consider ℓ_1-Rank-r Approximation over GF(2), where for a binary m×n...

05/21/2017
Nice latent variable models have log-rank
Matrices of low rank are pervasive in big data, appearing in recommender...

04/14/2019
Finding a latent k-simplex in O(k·nnz(data)) time via Subset Smoothing
The core problem in many Latent Variable Models, widely used in Unsuperv...

02/10/2018
Low-Rank Methods in Event Detection
We present low-rank methods for event detection. We assume that normal o...

05/04/2023
A Spectral Method for Identifiable Grade of Membership Analysis with Binary Responses
Grade of Membership (GoM) models are popular individual-level mixture mo...
