Topology identifies emerging adaptive mutations in SARS-CoV-2

06/14/2021
by Michael Bleher, et al.

The COVID-19 pandemic has led to a worldwide effort to characterize the evolution of the coronavirus SARS-CoV-2 by mapping mutations in its genome. Ideally, one would like to leverage the large number of sequenced genomes to quickly identify new mutations that could confer adaptive advantages (e.g. higher infectivity or immune evasion). One way of identifying adaptive mutations is to look for convergent mutations, that is, mutations at the same genomic position that occur independently. However, the large number of currently available genomes precludes the efficient use of phylogeny-based techniques. Here, we establish a fast and scalable Topological Data Analysis approach, based on persistent homology, for the early warning and surveillance of emerging adaptive mutations. It identifies convergent events merely by their topological footprint and thus overcomes limitations of current phylogenetic inference techniques. This allows for an unbiased and rapid analysis of large viral datasets. We introduce a new topological measure for convergent evolution and apply it to the GISAID dataset as of February 2021, comprising 303,651 high-quality SARS-CoV-2 isolates collected since the beginning of the pandemic. We find that topologically salient mutations on the receptor-binding domain appear in several variants of concern and are linked with an increase in infectivity and immune escape, and that for many adaptive mutations the topological signal precedes an increase in prevalence. We show that our method effectively identifies emerging adaptive mutations at an early stage. By localizing topological signals in the dataset, we extract geo-temporal information about the early occurrence of emerging adaptive mutations. The identification of these mutations can help to develop an alert system that monitors mutations of concern and can guide experimentalists to focus their studies on specific circulating variants.
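
To make the idea of a "topological footprint" concrete, here is a minimal toy sketch, not the authors' pipeline. It assumes sequences are compared by Hamming distance (an assumption; the abstract does not name the metric) and computes the first Betti number of the Vietoris-Rips complex at a fixed scale `r` over GF(2). The helper names `hamming`, `gf2_rank` and `betti1` are hypothetical. In the toy data, two mutations that each arise independently on two backgrounds leave a one-dimensional cycle, while a purely tree-like history leaves none.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def gf2_rank(rows):
    """Rank over GF(2) of a matrix whose rows are given as integer bitmasks."""
    pivots = {}  # highest set bit -> reduced row with that leading bit
    for row in rows:
        while row:
            top = row.bit_length() - 1
            if top not in pivots:
                pivots[top] = row
                break
            row ^= pivots[top]  # eliminate the leading bit and keep reducing
    return len(pivots)


def betti1(seqs, r):
    """First Betti number of the Vietoris-Rips complex at scale r (toy sizes only)."""
    n = len(seqs)
    # Edges: pairs of sequences within Hamming distance r.
    edges = [(i, j) for i, j in combinations(range(n), 2)
             if hamming(seqs[i], seqs[j]) <= r]
    edge_index = {e: k for k, e in enumerate(edges)}
    # Boundary of each edge, encoded as a bitmask over vertices.
    d1 = [(1 << i) | (1 << j) for i, j in edges]
    # Boundary of each triangle, encoded as a bitmask over edges.
    d2 = []
    for i, j, k in combinations(range(n), 3):
        sides = [(i, j), (i, k), (j, k)]
        if all(s in edge_index for s in sides):
            d2.append(sum(1 << edge_index[s] for s in sides))
    # b1 = dim ker(d1) - rank(d2) = (#edges - rank d1) - rank d2
    return len(edges) - gf2_rank(d1) - gf2_rank(d2)


# Two sites, each mutated independently on two backgrounds: a 4-cycle.
convergent = ["00", "10", "11", "01"]
# Mutations accumulated one after another along a single lineage: a path.
tree_like = ["000", "100", "110", "111"]

cycles_convergent = [betti1(convergent, r) for r in (1, 2)]
cycles_tree_like = [betti1(tree_like, r) for r in (1, 2, 3)]
print(cycles_convergent, cycles_tree_like)
```

In this toy example the cycle in `convergent` exists at scale 1 and is filled in at scale 2, which is the kind of birth and death that persistent homology tracks across all scales at once. For datasets of realistic size, a dedicated persistent homology library would be needed; the brute-force enumeration of triangles here is only meant for illustration.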

