Subspace Learning and Imputation for Streaming Big Data Matrices and Tensors

04/17/2014
by Morteza Mardani, et al.

Extracting latent low-dimensional structure from high-dimensional data is of paramount importance in the timely inference tasks encountered in "Big Data" analytics. However, increasingly noisy, heterogeneous, and incomplete datasets, as well as the need for real-time processing of streaming data, pose major challenges to this end. In this context, the present paper extends the benefits of rank minimization to scalable imputation of missing data, by tracking low-dimensional subspaces and unraveling latent (possibly multi-way) structure from incomplete streaming data. For low-rank matrix data, a subspace estimator is proposed based on an exponentially-weighted least-squares criterion regularized with the nuclear norm. After recasting the non-separable nuclear norm into a form amenable to online optimization, real-time algorithms with complementary strengths are developed, and their convergence is established under simplifying technical assumptions. In a stationary setting, the asymptotic estimates obtained offer the well-documented performance guarantees of the batch nuclear-norm regularized estimator. Under the same unifying framework, a novel online (adaptive) algorithm is developed to obtain multi-way decompositions of low-rank tensors with missing entries, and to perform imputation as a byproduct. Tests with both synthetic and real Internet and cardiac magnetic resonance imagery (MRI) data confirm the efficacy of the proposed algorithms and their superior performance relative to state-of-the-art alternatives.
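
To make the idea of tracking a subspace from incomplete streaming vectors concrete, the following is a minimal illustrative sketch, not the authors' algorithm. It assumes the nuclear norm is replaced by the standard separable surrogate (half the sum of squared Frobenius norms of two factors, L and Q), solves a small ridge regression on the observed entries of each incoming column, and then takes a normalized stochastic-gradient step on the subspace factor. The function name `track_subspace`, the parameters `lam` and `step`, the step normalization, and the omission of the exponential forgetting factor are all simplifying assumptions introduced here for illustration.

```python
import numpy as np


def track_subspace(Y, mask, rank, lam=0.1, step=0.5, seed=0):
    """Illustrative online subspace tracking with missing entries.

    Y    : (P, T) data matrix; only entries where mask is True are used.
    mask : (P, T) boolean array, True where an entry is observed.
    Returns the final subspace factor L (P, rank) and the imputed
    estimate X_hat (P, T), built one column at a time.
    """
    rng = np.random.default_rng(seed)
    P, T = Y.shape
    L = rng.standard_normal((P, rank)) / np.sqrt(P)  # random initial subspace
    X_hat = np.zeros((P, T))

    for t in range(T):
        obs = mask[:, t]          # rows observed at time t
        L_obs = L[obs]            # corresponding rows of the subspace factor
        y_obs = Y[obs, t]

        # Step 1: ridge-regularized projection coefficients for this column.
        gram = L_obs.T @ L_obs + lam * np.eye(rank)
        q = np.linalg.solve(gram, L_obs.T @ y_obs)

        # Step 2: normalized gradient step on L (assumption: the
        # normalization by lam + ||q||^2 is added here for stability).
        resid = y_obs - L_obs @ q
        mu = step / (lam + q @ q)
        L = (1.0 - mu * lam / (t + 1)) * L   # shrinkage from the ||L||_F^2 term
        L[obs] += mu * np.outer(resid, q)    # only observed rows see the data

        # Impute the full column, including the missing entries.
        X_hat[:, t] = L @ q

    return L, X_hat
```

Calling `track_subspace(Y, mask, rank=r)` on a `(P, T)` array with a boolean observation mask returns the current subspace factor and a column-by-column imputation. Each step costs a `rank`-by-`rank` solve plus one outer product, which is what makes this style of update suitable for streaming data.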

