CustOmics: A versatile deep-learning based strategy for multi-omics integration

09/12/2022
by Hakim Benkirane, et al.

Recent advances in high-throughput sequencing technologies have enabled the extraction of multiple features that depict patient samples at diverse and complementary molecular levels. The generation of such data has led to new challenges in computational biology regarding the integration of high-dimensional and heterogeneous datasets that capture the interrelationships between multiple genes and their functions. Thanks to their versatility and ability to learn synthetic latent representations of complex data, deep learning methods offer promising perspectives for integrating multi-omics data. These methods have led to the conception of many original architectures that are primarily based on autoencoder models. However, because the task is difficult, the choice of integration strategy is fundamental to taking full advantage of each source's particularities without losing the global trends. This paper presents a novel strategy for building a customizable autoencoder model that adapts to the dataset used in high-dimensional multi-source integration. We assess the impact of integration strategies on the latent representation and combine the best strategies to propose a new method, CustOmics (https://github.com/HakimBenkirane/CustOmics). We focus here on the integration of data from multiple omics sources and demonstrate the performance of the proposed method on test cases for several tasks, such as classification and survival analysis.


