Convolutional neural network models for cancer type prediction based on gene expression

06/18/2019
by   Milad Mostavi, et al.

Background: Precise prediction of cancer types is vital for cancer diagnosis and therapy. Important cancer marker genes can be inferred through predictive models. Several studies have attempted to build machine learning models for this task; however, none has taken into consideration the effects of tissue of origin, which can potentially bias the identification of cancer markers.

Results: In this paper, we introduced several Convolutional Neural Network (CNN) models that take unstructured gene expression inputs to classify tumor and non-tumor samples into their designated cancer types or as normal. Based on different designs of gene embeddings and convolution schemes, we implemented three CNN models: 1D-CNN, 2D-Vanilla-CNN, and 2D-Hybrid-CNN. The models were trained and tested on a combined set of 10,340 samples of 33 cancer types and 731 matched normal tissues from The Cancer Genome Atlas (TCGA). Our models achieved excellent prediction accuracies (93.9-95.0%). Furthermore, we interpreted one of the models, the 1D-CNN model, with a guided saliency technique and identified a total of 2,090 cancer markers (108 per class). The concordance of differential expression of these markers between the cancer type they represent and others is confirmed. In breast cancer, for instance, our model identified well-known markers, such as GATA3 and ESR1. Finally, we extended the 1D-CNN model for prediction of breast cancer subtypes and achieved an average accuracy of 88.42%. The code can be found at https://github.com/chenlabgccri/CancerTypePrediction.
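
To make the 1D-CNN idea concrete, the following is a minimal sketch of a forward pass: a gene expression vector is scanned by 1D kernels, max-pooled, and mapped by a dense layer to 34 class probabilities (33 cancer types plus normal). It is not the authors' implementation. The gene count, kernel count, kernel width, stride, pool size, and random weights are all hypothetical values chosen only for illustration.

```python
import math
import random


def conv1d(x, kernels, stride):
    """Valid 1D convolution followed by ReLU; one feature map per kernel."""
    width = len(kernels[0])
    maps = []
    for kernel in kernels:
        fmap = []
        for start in range(0, len(x) - width + 1, stride):
            s = sum(w * x[start + i] for i, w in enumerate(kernel))
            fmap.append(max(0.0, s))  # ReLU
        maps.append(fmap)
    return maps


def max_pool(fmap, size):
    """Non-overlapping max pooling over one feature map."""
    return [max(fmap[i:i + size]) for i in range(0, len(fmap) - size + 1, size)]


def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]


def predict(x, kernels, dense_w, dense_b, stride, pool):
    """Conv -> ReLU -> max pool -> flatten -> dense -> softmax."""
    feats = [v for fmap in conv1d(x, kernels, stride) for v in max_pool(fmap, pool)]
    logits = [sum(w * f for w, f in zip(row, feats)) + b
              for row, b in zip(dense_w, dense_b)]
    return softmax(logits)


# Toy setup: every size below is an assumption, not a value from the paper.
rng = random.Random(0)
N_GENES, N_CLASSES = 200, 34          # 34 = 33 cancer types + normal
N_KERNELS, WIDTH, STRIDE, POOL = 4, 10, 10, 2

x = [rng.random() for _ in range(N_GENES)]  # stand-in expression profile
kernels = [[rng.uniform(-1, 1) for _ in range(WIDTH)] for _ in range(N_KERNELS)]
n_feats = N_KERNELS * (((N_GENES - WIDTH) // STRIDE + 1) // POOL)
dense_w = [[rng.uniform(-1, 1) for _ in range(n_feats)] for _ in range(N_CLASSES)]
dense_b = [0.0] * N_CLASSES

probs = predict(x, kernels, dense_w, dense_b, STRIDE, POOL)
predicted_class = max(range(N_CLASSES), key=lambda c: probs[c])
```

With untrained random weights the predicted class is meaningless; the sketch only shows how an unstructured expression vector flows through a 1D convolution to a 34-way output.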
