In this work, we design, analyze, and optimize sequential and shared-mem...
The Tucker tensor decomposition is a natural extension of the singular v...
Multiple Tensor-Times-Matrix (Multi-TTM) is a key computation in algorit...
Communication lower bounds have long been established for matrix multipl...
The CP tensor decomposition is used in applications such as machine lear...
The Tensor-Train (TT) format is a highly compact low-rank representation...
We present efficient and scalable parallel algorithms for performing mat...
We introduce a Generalized Randomized QR-decomposition that may be appli...
We consider the problem of low-rank approximation of massive dense non-n...
We consider the problem of joint three-dimensional (3D) localization and...
Our goal is compression of massive-scale grid-structured data, such as t...
The CP tensor decomposition is a low-rank approximation of a tensor. We ...
Interprocessor communication often dominates the runtime of large matrix...
This is the second in a series of papers on rank decompositions of the m...
The CANDECOMP/PARAFAC (CP) decomposition is a leading method for the ana...
Non-negative matrix factorization (NMF) is the problem of determining tw...