Distinguishing between Normal and Cancer Cells Using Autoencoder Node Saliency

01/30/2019
by Ya Ju Fan, et al.

Gene expression profiles have been widely used to characterize patterns of cellular responses to diseases. As more data become available, scalable learning toolkits become essential for processing large datasets with deep learning models that capture complex biological processes. We present an autoencoder to capture nonlinear relationships in gene expression profiles. The autoencoder is a nonlinear dimension reduction technique based on an artificial neural network, which learns hidden representations of unlabeled data. We train the autoencoder on a large collection of tumor samples from the National Cancer Institute Genomic Data Commons and obtain a generalized, unsupervised latent representation. We leverage an HPC-focused deep learning toolkit, the Livermore Big Artificial Neural Network (LBANN), to parallelize the training algorithm efficiently, reducing computation time from several hours to a few minutes. With the trained autoencoder, we generate latent representations of a small dataset containing pairs of normal and cancer cells of various tumor types. We introduce a novel measure, autoencoder node saliency (ANS), to identify the hidden nodes that best differentiate the various pairs of cells. We compare the best classifying nodes with results from principal component analysis and with visualizations from t-distributed stochastic neighbor embedding. We demonstrate that the autoencoder effectively extracts distinct gene features for multiple learning tasks in the dataset.
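To make the idea of an autoencoder as a nonlinear dimension reduction concrete, here is a minimal sketch in Python with NumPy. It is not the LBANN implementation or the architecture used in the paper. The single hidden layer, sigmoid activations, mean squared error loss, learning rate, and synthetic stand-in data are all assumptions chosen for illustration, and the names `train_autoencoder` and `encode` are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_autoencoder(X, n_hidden, epochs=300, lr=5.0, seed=0):
    """Train a one-hidden-layer autoencoder with full-batch gradient descent.

    Returns the learned parameters and the mean squared reconstruction
    error recorded at the start of every epoch.
    """
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    W1 = rng.normal(0.0, 0.1, (n_features, n_hidden))  # encoder weights
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, n_features))  # decoder weights
    b2 = np.zeros(n_features)
    losses = []
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)       # hidden (latent) representation
        X_hat = sigmoid(H @ W2 + b2)   # reconstruction of the input
        err = X_hat - X
        losses.append(float(np.mean(err ** 2)))
        # Backpropagate the mean squared error through both sigmoid layers.
        d_out = 2.0 * err * X_hat * (1.0 - X_hat) / err.size
        d_hid = (d_out @ W2.T) * H * (1.0 - H)
        W2 -= lr * (H.T @ d_out)
        b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * (X.T @ d_hid)
        b1 -= lr * d_hid.sum(axis=0)
    return (W1, b1, W2, b2), losses


def encode(X, params):
    """Map samples to hidden-node activations (one column per hidden node)."""
    W1, b1, _, _ = params
    return sigmoid(X @ W1 + b1)


# Synthetic stand-in for expression profiles scaled to (0, 1):
# 200 samples, 20 "genes", generated from 3 underlying factors.
rng = np.random.default_rng(0)
X = sigmoid(rng.normal(size=(200, 3)) @ rng.normal(size=(3, 20)))

params, losses = train_autoencoder(X, n_hidden=4)
codes = encode(X, params)  # shape (200, 4): latent representation per sample
print(f"reconstruction MSE: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

Each column of `codes` holds the activations of one hidden node across all samples. Those per-node activation profiles are the kind of quantity a node-ranking measure such as ANS would examine, although the measure itself is not reproduced here.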


