MutFormer: A context-dependent transformer-based model to predict pathogenic missense mutations
A missense mutation is a point mutation that results in a substitution of an amino acid in a protein sequence. Currently, missense mutations account for approximately half of the known variants responsible for human inherited diseases, but accurate prediction of the pathogenicity of missense variants is still challenging. Recent advances in deep learning show that transformer models are particularly powerful at modeling sequences. In this study, we introduce MutFormer, a transformer-based model for prediction of pathogenic missense mutations. We pre-trained MutFormer on reference protein sequences and alternative protein sequences result from common genetic variants. We tested different fine-tuning methods for pathogenicity prediction. Our results show that MutFormer outperforms a variety of existing tools. MutFormer and pre-computed variant scores are publicly available on GitHub at https://github.com/WGLab/mutformer.
READ FULL TEXT