SDGCCA: Supervised Deep Generalized Canonical Correlation Analysis for Multi-omics Integration

04/17/2022
by Jeongyoung Hwang, et al.

Integrating multi-omics data offers opportunities to reveal biological mechanisms underlying specific phenotypes. We propose a novel multi-omics integration method, supervised deep generalized canonical correlation analysis (SDGCCA), which models correlation structures between nonlinear multi-omics manifolds in order to improve phenotype classification and reveal phenotype-related biomarkers. SDGCCA addresses the limitations of other canonical correlation analysis (CCA)-based models (e.g., deep CCA, deep generalized CCA) by both considering complex, nonlinear cross-data correlations and discriminating between phenotype groups. A few existing methods learn nonlinear CCA projections for phenotype discrimination, but they handle only two views. SDGCCA, in contrast, is a nonlinear multiview CCA projection method for discrimination. When applied to predicting Alzheimer's disease (AD) patients and to discriminating early- from late-stage cancers, SDGCCA outperformed other CCA-based methods and other supervised methods. In addition, we demonstrate that SDGCCA can be used for feature selection to identify important multi-omics biomarkers. On the AD data, SDGCCA identified clusters of genes in the multi-omics data that are well known to be associated with AD.
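For orientation, the multiview correlation idea that the abstract builds on can be illustrated with plain linear, unsupervised generalized CCA. The sketch below is not SDGCCA and is not taken from the paper: it omits the deep networks and the supervision term entirely. It uses the standard MAX-VAR formulation, and the function name `linear_gcca`, the ridge parameter `reg`, and the synthetic three-view data are all assumptions made for illustration.

```python
import numpy as np


def linear_gcca(views, k=2, reg=1e-3):
    """Linear MAX-VAR GCCA sketch (illustrative; not the SDGCCA method).

    views: list of arrays, each of shape (n_samples, d_j).
    Returns a shared representation G (n_samples, k) with orthonormal
    columns and one projection matrix U_j (d_j, k) per view.
    """
    n = views[0].shape[0]
    M = np.zeros((n, n))
    centered = []
    for X in views:
        Xc = X - X.mean(axis=0)  # center each feature
        centered.append(Xc)
        C = Xc.T @ Xc + reg * np.eye(Xc.shape[1])  # ridge-regularized scatter
        M += Xc @ np.linalg.solve(C, Xc.T)  # (regularized) projection onto the view
    # eigh returns eigenvalues in ascending order; keep the top k, largest first.
    _, eigvecs = np.linalg.eigh(M)
    G = eigvecs[:, -k:][:, ::-1]
    # Per-view least-squares maps from features to the shared representation.
    U = [
        np.linalg.solve(Xc.T @ Xc + reg * np.eye(Xc.shape[1]), Xc.T @ G)
        for Xc in centered
    ]
    return G, U


# Synthetic example: three hypothetical "omics" views driven by one latent factor.
rng = np.random.default_rng(0)
z = rng.normal(size=(100, 1))
views = [
    z @ rng.normal(size=(1, d)) + 0.1 * rng.normal(size=(100, d))
    for d in (5, 8, 6)
]
G, U = linear_gcca(views, k=1)
corr = abs(np.corrcoef(G[:, 0], z[:, 0])[0, 1])
```

In this toy setting the shared component `G` should track the latent factor `z` closely. A deep or supervised variant would replace the raw views with network outputs and add label information, which this sketch does not attempt.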


research
04/30/2013

Generalized Canonical Correlation Analysis for Classification

For multiple multivariate data sets, we derive conditions under which Ge...
research
07/03/2019

Canonical Correlation Analysis (CCA) Based Multi-View Learning: An Overview

Multi-view learning (MVL) is a strategy for fusing data from different s...
research
02/08/2017

Deep Generalized Canonical Correlation Analysis

We present Deep Generalized Canonical Correlation Analysis (DGCCA) -- a ...
research
08/16/2016

Application of multiview techniques to NHANES dataset

Disease prediction or classification using health datasets involve using...
research
09/17/2017

Learning Mixtures of Multi-Output Regression Models by Correlation Clustering for Multi-View Data

In many datasets, different parts of the data may have their own pattern...
research
06/15/2021

Canonical-Correlation-Based Fast Feature Selection

This paper proposes a canonical-correlation-based filter method for feat...
