MM-PCA: Integrative Analysis of Multi-group and Multi-view Data

11/12/2019
by Jonatan Kallus, et al.

Data integration is the problem of combining multiple data groups (studies, cohorts) and/or multiple data views (variables, features). This task is becoming increasingly important in many disciplines due to the prevalence of large and heterogeneous data sets. Data integration commonly aims to identify structure that is consistent across multiple cohorts and feature sets. While such joint analyses can yield more information than single data sets analyzed separately, a globally restrictive integration of heterogeneous data may also obscure the signal of interest. We therefore propose a data-adaptive integration method that allows structure in the data to be shared across an a priori unknown subset of cohorts and views. The method, Multi-group Multi-view Principal Component Analysis (MM-PCA), identifies partially shared, sparse low-rank components. This also results in an integrative bi-clustering across cohorts and views. The strengths of MM-PCA are illustrated on simulated data and on 'omics data from The Cancer Genome Atlas. MM-PCA is available as an R package.

Key words: Data integration, Multi-view, Multi-group, Bi-clustering
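To make the idea of a partially shared, sparse component concrete, the following toy sketch may help. It is not the MM-PCA algorithm and not the API of the R package: it is a generic rank-one sparse PCA (alternating power iteration with soft-thresholding) applied to three hypothetical views measured on the same samples. The data, the penalty value `lam`, and all function names are assumptions made for illustration only. Two views carry a common signal and the third is weak noise, so the penalty is expected to set the third view's loadings to exactly zero, leaving a component shared by only a subset of views.

```python
import math


def soft_threshold(x, lam):
    """Shrink x toward zero by lam; values with |x| <= lam become zero."""
    return math.copysign(max(abs(x) - lam, 0.0), x)


def normalize(vec):
    """Scale vec to unit Euclidean length (left unchanged if all zero)."""
    norm = math.sqrt(sum(a * a for a in vec))
    return [a / norm for a in vec] if norm > 0 else list(vec)


def sparse_rank1(X, lam, n_iter=50):
    """Rank-one sparse PCA by alternating updates (illustrative only).

    u holds sample scores, v holds feature loadings. The loadings are
    soft-thresholded at every step, which zeroes out weak features.
    """
    n, p = len(X), len(X[0])
    v = normalize([1.0] * p)
    for _ in range(n_iter):
        u = normalize([sum(X[i][j] * v[j] for j in range(p)) for i in range(n)])
        v = normalize([
            soft_threshold(sum(X[i][j] * u[i] for i in range(n)), lam)
            for j in range(p)
        ])
    return u, v


# Hypothetical toy data: six samples, three views on the same samples.
signal = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
view_a = [[2.0 * s, 2.0 * s] for s in signal]   # carries the signal
view_b = [[1.5 * s] for s in signal]            # carries the signal
view_c = [[0.1, -0.1], [0.1, 0.1], [-0.1, 0.1],
          [-0.1, -0.1], [0.1, 0.1], [-0.1, -0.1]]  # weak noise only

# Concatenate the views column-wise and remember which columns belong where.
X = [a + b + c for a, b, c in zip(view_a, view_b, view_c)]
columns = {"A": [0, 1], "B": [2], "C": [3, 4]}

u, v = sparse_rank1(X, lam=0.5)

# A view takes part in the component if any of its loadings is nonzero.
active = {name: any(v[j] != 0.0 for j in cols) for name, cols in columns.items()}
print(active)
```

In this sketch the sparsity pattern of the loading vector decides which views take part in the component. The abstract describes MM-PCA as selecting the sharing subset adaptively across both cohorts and views; the toy example only covers views and uses a single fixed penalty.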


research
08/25/2016

Multi-View Fuzzy Clustering with Minimax Optimization for Effective Clustering of Data from Multiple Sources

Multi-view data clustering refers to categorizing a data set by making g...
research
07/20/2017

Structural Learning and Integrative Decomposition of Multi-View Data

The increased availability of the multi-view data (data on the same samp...
research
12/01/2022

Data Integration Via Analysis of Subspaces (DIVAS)

Modern data collection in many data paradigms, including bioinformatics,...
research
10/02/2020

Deep Incomplete Multi-View Multiple Clusterings

Multi-view clustering aims at exploiting information from multiple heter...
research
06/26/2022

Hierarchical nuclear norm penalization for multi-view data

The prevalence of data collected on the same set of samples from multipl...
research
02/07/2018

Multi-View Bayesian Correlated Component Analysis

Correlated component analysis as proposed by Dmochowski et al. (2012) is...
research
07/17/2022

Personalized PCA: Decoupling Shared and Unique Features

In this paper, we tackle a significant challenge in PCA: heterogeneity. ...
