Structural Learning and Integrative Decomposition of Multi-View Data

by Irina Gaynanova, et al.

The increased availability of multi-view data (data on the same samples from multiple sources) has led to strong interest in models based on low-rank matrix factorizations. These models represent each data view via shared and individual components, and have been successfully applied for exploratory dimension reduction, association analysis between the views, and further learning tasks such as consensus clustering. Despite these advances, significant challenges remain in modeling partially shared components and in identifying the number of components of each type (shared, partially shared, individual). In this work, we formulate a novel linked component model that directly incorporates partially shared structures. We call this model SLIDE, for Structural Learning and Integrative DEcomposition of multi-view data. We prove the existence of the SLIDE decomposition and explicitly characterize its identifiability conditions. The proposed model fitting and selection techniques allow for joint identification of the number of components of each type, in contrast to existing sequential approaches. In our empirical studies, SLIDE demonstrates excellent performance in both signal estimation and component selection. We further illustrate the methodology on breast cancer data from The Cancer Genome Atlas repository.
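To make the three component types concrete, here is a minimal sketch, not the fitting procedure from the paper. It assumes a generic linked low-rank formulation in which every view shares one score matrix and each view's loading block is zeroed out for the components that are inactive in that view. The binary `structure` matrix, the function names, and all dimensions are hypothetical choices made for this illustration.

```python
import numpy as np


def simulate_linked_views(n, view_dims, structure, seed=0):
    """Simulate noiseless multi-view signals X_k = U V_k^T.

    structure[k, j] == 1 means component j is active in view k
    (an encoding assumed for this sketch, not taken from the abstract).
    """
    rng = np.random.default_rng(seed)
    structure = np.asarray(structure)
    n_comp = structure.shape[1]
    scores = rng.standard_normal((n, n_comp))  # U: samples x components
    loadings = []
    for k, p in enumerate(view_dims):
        V_k = rng.standard_normal((p, n_comp))
        V_k *= structure[k]  # zero the columns of components inactive in view k
        loadings.append(V_k)
    views = [scores @ V_k.T for V_k in loadings]
    return scores, loadings, views


def component_type(column):
    """Label a component by the number of views in which it is active."""
    active = int(np.sum(column))
    if active == len(column):
        return "shared"
    if active == 1:
        return "individual"
    return "partially shared"


# Three views, five components: rows are views, columns are components.
structure = np.array([
    [1, 1, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [1, 0, 0, 0, 1],
])

scores, loadings, views = simulate_linked_views(
    n=50, view_dims=[20, 30, 25], structure=structure
)
types = [component_type(structure[:, j]) for j in range(structure.shape[1])]
ranks = [int(np.linalg.matrix_rank(X)) for X in views]

print(types)
print(ranks)
```

In this toy setting the rank of each view's signal equals the number of components active in that view, which is why choosing the number of components of each type amounts to choosing the zero pattern of the loadings.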







MM-PCA: Integrative Analysis of Multi-group and Multi-view Data

Data integration is the problem of combining multiple data groups (studi...

Partially Shared Semi-supervised Deep Matrix Factorization with Multi-view Data

Since many real-world data can be described from multiple views, multi-v...

Integrative Factorization of Bidimensionally Linked Matrices

Advances in molecular "omics" technologies have motivated new methodolo...

Principal Structure Identification: Fast Disentanglement of Multi-source Dataset

Analysis of multi-source data, where data on the same objects are collec...

Directionally Dependent Multi-View Clustering Using Copula Model

In recent biomedical scientific problems, it is a fundamental issue to i...

Double-matched matrix decomposition for multi-view data

We consider the problem of extracting joint and individual signals from ...

Bidimensional linked matrix factorization for pan-omics pan-cancer analysis

Several modern applications require the integration of multiple large da...