Unsupervised machine learning framework for discriminating major variants of concern during COVID-19

08/01/2022
by Mingyue Kang, et al.

The rapid evolution of SARS-CoV-2, the virus that causes COVID-19, has produced a number of mutations and variants such as Alpha, Gamma, Delta and Omicron, which have had a massive impact on the world economy. Unsupervised machine learning methods can compress, characterise and visualise unlabelled data. In this paper, we present a framework that combines selected dimensionality reduction and clustering methods to discriminate the major COVID-19 variants and visualise the associations among them, based on their genome sequences. The framework uses k-mer analysis to process the genome (RNA) sequences and compares three dimensionality reduction methods: principal component analysis (PCA), t-distributed stochastic neighbour embedding (t-SNE), and uniform manifold approximation and projection (UMAP). Furthermore, the framework employs agglomerative hierarchical clustering and visualises the result with a dendrogram. We find that the proposed framework can effectively distinguish the major variants and hence can be used for distinguishing emerging variants in the future.
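As a rough illustration of the pipeline the abstract describes, the sketch below turns sequences into normalised k-mer frequency vectors and groups them with agglomerative hierarchical clustering. It is not the authors' implementation: the toy sequences, the choice of k = 2, the Euclidean distance, the average linkage, and the function names `kmer_vector` and `agglomerative` are all assumptions made for this example, and the dimensionality reduction step (PCA, t-SNE or UMAP) is skipped so that the code needs only the Python standard library.

```python
from collections import Counter
from itertools import product
import math

ALPHABET = "ACGT"  # assumption: sequences are stored using DNA letters


def kmer_vector(seq, k=2):
    """Normalised frequency of every possible k-mer over ALPHABET."""
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers) or 1  # avoid division by zero
    return [counts[m] / total for m in kmers]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerative(vectors, n_clusters):
    """Average-linkage agglomerative clustering (assumed linkage).

    Returns the final clusters (lists of row indices) and the merge
    history as (left members, right members, distance) tuples.
    """
    clusters = [[i] for i in range(len(vectors))]
    merges = []
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pairs = [(a, b) for a in clusters[i] for b in clusters[j]]
                d = sum(euclidean(vectors[a], vectors[b]) for a, b in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((tuple(clusters[i]), tuple(clusters[j]), d))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because j > i
    return clusters, merges


# Hypothetical toy sequences, not real SARS-CoV-2 genomes.
names = ["at_1", "at_2", "gc_1", "gc_2"]
seqs = ["ATATATATATAT", "ATATATTATATA", "GCGCGCGCGCGC", "GCGCGGCGCGCG"]

vectors = [kmer_vector(s, k=2) for s in seqs]
clusters, merges = agglomerative(vectors, n_clusters=2)

for members in clusters:
    print([names[i] for i in members])
```

The merge history records which groups were joined and at what distance, which is the information a dendrogram displays. For real genomes, a larger k and a library implementation of the clustering and dimensionality reduction steps would be the more practical choice.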

research
12/13/2021

Unsupervised machine learning approaches to the q-state Potts model

In this paper we study phase transitions of the q-state Potts model, t...
research
05/30/2018

Recurrent Deep Embedding Networks for Genotype Clustering and Ethnicity Prediction

The understanding of variations in genome sequences assists us in identi...
research
09/12/2021

Spike2Vec: An Efficient and Scalable Embedding Approach for COVID-19 Spike Sequences

With the rapid global spread of COVID-19, more and more data related to ...
research
03/18/2021

Unsupervised Doppler Radar-Based Activity Recognition for e-healthcare

Passive radio frequency (RF) sensing and monitoring of human daily activ...
research
08/23/2021

Mutational signatures and transmissibility of SARS-CoV-2 Gamma and Lambda variants

The emergence of SARS-CoV-2 variants of concern endangers the long-term ...
research
06/09/2018

Efficient Optimization Algorithms for Robust Principal Component Analysis and Its Variants

Robust PCA has drawn significant attention in the last decade due to its...
research
06/14/2021

Topology identifies emerging adaptive mutations in SARS-CoV-2

The COVID-19 pandemic has led to a worldwide effort to characterize its...
