MutFormer: A context-dependent transformer-based model to predict pathogenic missense mutations

10/27/2021
by Theodore Jiang, et al.

A missense mutation is a point mutation that results in the substitution of one amino acid in a protein sequence. Currently, missense mutations account for approximately half of the known variants responsible for human inherited diseases, but accurately predicting the pathogenicity of missense variants remains challenging. Recent advances in deep learning show that transformer models are particularly powerful at modeling sequences. In this study, we introduce MutFormer, a transformer-based model for predicting pathogenic missense mutations. We pre-trained MutFormer on reference protein sequences and on alternative protein sequences resulting from common genetic variants. We tested different fine-tuning methods for pathogenicity prediction. Our results show that MutFormer outperforms a variety of existing tools. MutFormer and pre-computed variant scores are publicly available on GitHub at https://github.com/WGLab/mutformer.
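To make the notion of an "alternative protein sequence" concrete, here is a minimal Python sketch of applying a single amino-acid substitution to a reference sequence. It is an illustration only and is not taken from the MutFormer repository: the function name `apply_missense`, its parameters, and the toy sequence are all hypothetical, and the 1-based position convention is an assumption.

```python
# Illustrative sketch only; not code from the MutFormer repository.
# The 20 standard amino acids in one-letter code.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def apply_missense(reference: str, position: int, ref_aa: str, alt_aa: str) -> str:
    """Return the alternative sequence produced by one amino-acid substitution.

    `position` is 1-based (an assumption made for this sketch).
    """
    if ref_aa not in STANDARD_AMINO_ACIDS or alt_aa not in STANDARD_AMINO_ACIDS:
        raise ValueError("ref_aa and alt_aa must be standard one-letter amino acids")
    if ref_aa == alt_aa:
        # No amino-acid change, so this is not a missense substitution.
        raise ValueError("ref_aa and alt_aa are identical")
    if not 1 <= position <= len(reference):
        raise ValueError("position is outside the reference sequence")
    if reference[position - 1] != ref_aa:
        raise ValueError("reference sequence does not contain ref_aa at position")
    # Strings are immutable, so build the alternative sequence by slicing.
    return reference[: position - 1] + alt_aa + reference[position:]


# Toy example (hypothetical sequence): alanine at position 4 replaced by valine.
alternative = apply_missense("MKTAYIAK", 4, "A", "V")
```

The reference and alternative sequences differ at exactly one residue, which is the kind of paired input a context-dependent model can compare.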


