Supervised multiway factorization

09/11/2016
by Eric F. Lock, et al.

We describe SupCP, a probabilistic PARAFAC/CANDECOMP (CP) factorization for multiway (i.e., tensor) data that incorporates auxiliary covariates. SupCP generalizes the supervised singular value decomposition (SupSVD), which applies to vector-valued observations, to observations that take the form of a matrix or higher-order array. Such data are increasingly encountered in biomedical research and other fields. We describe a likelihood-based latent variable representation of the CP factorization, in which the latent variables are informed by additional covariates. We give conditions for identifiability, and develop an EM algorithm for simultaneous estimation of all model parameters. SupCP can be used for dimension reduction, capturing latent structures that are more accurate and interpretable due to covariate supervision. Moreover, SupCP specifies a full probability distribution for a multiway data observation with given covariate values, which can be used for predictive modeling. We conduct comprehensive simulations to evaluate the SupCP algorithm, and we apply it to a facial image database with facial descriptors (e.g., smiling / not smiling) as covariates. Software is available at https://github.com/lockEF/SupCP.
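To make the idea of covariate-informed latent variables concrete, the following is a minimal sketch of how one might simulate data from a model of this general shape for a three-way array. It is not the authors' software or their EM algorithm. The function name `simulate_supcp`, the symbols `Y`, `B`, `V1`, `V2`, and the choice of independent Gaussian noise with standard deviations `sigma_f` and `sigma_e` are all assumptions made for illustration; the abstract does not spell out the model equations.

```python
import random

def simulate_supcp(Y, B, V1, V2, sigma_f=1.0, sigma_e=0.1, seed=0):
    """Draw one n x d1 x d2 array from a covariate-informed CP model (illustrative sketch)."""
    rng = random.Random(seed)
    n, q, r = len(Y), len(B), len(B[0])
    # Latent scores informed by covariates: U = Y B + F.
    # Assumption: entries of F are independent N(0, sigma_f^2).
    U = [[sum(Y[i][k] * B[k][c] for k in range(q)) + rng.gauss(0.0, sigma_f)
          for c in range(r)] for i in range(n)]
    # CP structure plus noise:
    # X[i][j][k] = sum_c U[i][c] * V1[j][c] * V2[k][c] + E[i][j][k],
    # with E assumed independent N(0, sigma_e^2).
    X = [[[sum(U[i][c] * V1[j][c] * V2[k][c] for c in range(r)) + rng.gauss(0.0, sigma_e)
           for k in range(len(V2))] for j in range(len(V1))] for i in range(n)]
    return U, X

# Hypothetical toy inputs: n = 3 observations, q = 1 covariate, rank r = 2.
Y = [[1.0], [0.0], [1.0]]                      # e.g., smiling / not smiling
B = [[2.0, -1.0]]                              # q x r covariate effects
V1 = [[1.0, 0.0], [0.0, 1.0]]                  # d1 x r loadings, mode 1
V2 = [[1.0, 1.0], [1.0, -1.0], [0.5, 0.0]]     # d2 x r loadings, mode 2
U, X = simulate_supcp(Y, B, V1, V2)
```

Each observation `X[i]` is a `d1 x d2` matrix whose expected value depends on the covariates only through the rank-`r` scores `Y B`, which is one way to read the claim that the model gives a distribution for a multiway observation at given covariate values.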


