Feature reduction for machine learning on molecular features: The GeneScore
We present the GeneScore, a concept of feature reduction for Machine Learning analysis of biomedical data. Using expert knowledge, the GeneScore integrates different molecular data types into a single score. We show that the GeneScore is superior to a binary matrix in the classification of cancer entities from SNV, Indel, CNV, gene fusion and gene expression data. The GeneScore is a straightforward way to facilitate state-of-the-art analysis, while making use of the available scientific knowledge on the nature of molecular data features used.
READ FULL TEXT