Biomarker Gene Identification for Breast Cancer Classification

11/10/2021
by Sheetal Rajpal, et al.

BACKGROUND: Breast cancer is one of the most prevalent cancers among women and carries a high mortality rate. Because breast cancer is heterogeneous, timely diagnosis and treatment require identifying the differentially expressed genes associated with each subtype. OBJECTIVE: To identify a small gene set for each of the four breast cancer subtypes that could act as its signature, the paper proposes a novel algorithm for gene signature identification. METHODS: The work applies interpretable AI methods to the predictions of a deep neural network trained for subtype classification on TCGA breast cancer RNA-Seq data, and uses the resulting explanations to identify biomarkers. RESULTS: The proposed algorithm yielded a signature set of 43 differentially expressed genes. A neural network classifier achieved a competitive average accuracy of 0.91 under 10-fold cross-validation. Gene set analysis revealed several relevant pathways, such as GRB7 events in ERBB2 signaling and the p53 signaling pathway. The Pearson correlation matrix showed that the subtype-specific genes are correlated within each subtype. CONCLUSIONS: The proposed technique yields a concise and clinically relevant gene signature set.
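The Pearson correlation check mentioned in the results can be illustrated with a minimal sketch. This is not the authors' code: it uses synthetic expression values rather than TCGA data, and the function names (`pearson_gene_matrix`, `within_between_means`), the group sizes, and the noise level are all assumptions chosen for illustration. It only shows how one might compare the average correlation among genes assigned to the same subtype against the average correlation across subtypes.

```python
import numpy as np


def pearson_gene_matrix(expr):
    """Return the genes x genes Pearson correlation matrix.

    expr is a samples x genes array, so columns are treated as variables.
    """
    return np.corrcoef(expr, rowvar=False)


def within_between_means(corr, gene_groups):
    """Mean correlation for gene pairs in the same group vs. different groups."""
    groups = np.asarray(gene_groups)
    same = groups[:, None] == groups[None, :]
    off_diag = ~np.eye(len(groups), dtype=bool)  # drop self-correlations
    within = corr[same & off_diag].mean()
    between = corr[~same].mean()
    return within, between


# Synthetic stand-in for expression data (hypothetical sizes, not from the paper):
# 4 gene groups of 5 genes each, every group driven by one hidden factor.
rng = np.random.default_rng(0)
n_samples, genes_per_group, n_groups = 200, 5, 4
gene_groups = np.repeat(np.arange(n_groups), genes_per_group)
latent = rng.normal(size=(n_samples, n_groups))
noise = 0.5 * rng.normal(size=(n_samples, gene_groups.size))
expr = latent[:, gene_groups] + noise

corr = pearson_gene_matrix(expr)
within, between = within_between_means(corr, gene_groups)
print(f"mean within-group r = {within:.2f}, mean between-group r = {between:.2f}")
```

On real data, `expr` would hold the expression values of the selected signature genes and `gene_groups` the subtype each gene was selected for; a clearly higher within-group mean would be consistent with the pattern the abstract describes.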


