Learning Embeddings from Cancer Mutation Sets for Classification Tasks

Analysis of somatic mutation profiles from cancer patients is essential to cancer research. However, the low frequency of most mutations and the varying mutation rates across patients make the data challenging to analyze statistically and difficult to use for classification, clustering, visualization, or other learning tasks. Low-dimensional representations of somatic mutation profiles that retain useful information about the DNA of cancer cells would therefore make such data easier to use in applications that advance precision medicine. In this paper, we discuss the open problem of learning from somatic mutations and present Flatsomatic, a solution that uses variational autoencoders (VAEs) to create latent representations of somatic profiles. Our results show the potential of this method: the VAE embeddings perform better than PCA on a clustering task and perform as well as the raw high-dimensional data on a classification task. We believe the methods presented herein can be of great value in future research and in bringing data-driven models into precision oncology.
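
To make the idea of a VAE embedding concrete, the following is a minimal sketch of the quantity a VAE typically minimizes (reconstruction error plus a KL penalty) for one mutation profile. It is not the Flatsomatic architecture, which the abstract does not specify. It assumes a binary per-gene profile, single linear layers for the encoder and decoder, and randomly initialised weights; all function and variable names are hypothetical, and no training is performed.

```python
import math
import random

def linear(weights, bias, v):
    """Affine map: one output per row of `weights`."""
    return [sum(w * vi for w, vi in zip(row, v)) + b for row, b in zip(weights, bias)]

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)), summed over latent dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))

def bernoulli_nll(x, probs, eps=1e-7):
    """Negative log-likelihood of a binary vector under per-gene probabilities."""
    total = 0.0
    for xi, p in zip(x, probs):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= xi * math.log(p) + (1 - xi) * math.log(1.0 - p)
    return total

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * noise, with noise drawn from N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]

def negative_elbo(x, enc_mu, enc_log_var, dec, rng):
    """Return (loss, embedding) for one profile; the embedding is the latent mean."""
    mu = linear(enc_mu[0], enc_mu[1], x)
    log_var = linear(enc_log_var[0], enc_log_var[1], x)
    z = reparameterize(mu, log_var, rng)
    probs = [sigmoid(a) for a in linear(dec[0], dec[1], z)]
    return bernoulli_nll(x, probs) + kl_to_standard_normal(mu, log_var), mu

# Toy setup (assumption): 6 genes compressed to a 2-dimensional embedding.
rng = random.Random(0)
n_genes, n_latent = 6, 2

def random_layer(n_out, n_in):
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

enc_mu = random_layer(n_latent, n_genes)
enc_log_var = random_layer(n_latent, n_genes)
dec = random_layer(n_genes, n_latent)

profile = [1, 0, 0, 1, 0, 0]  # hypothetical profile: 1 = gene mutated
loss, embedding = negative_elbo(profile, enc_mu, enc_log_var, dec, rng)
```

In practice the weights would be fitted by minimizing this loss over many profiles with a gradient-based optimizer in a deep learning framework, and the latent mean of each profile would then serve as its low-dimensional embedding for clustering or classification.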


