AlphaDesign: A graph protein design method and benchmark on AlphaFoldDB
While DeepMind has tentatively solved protein folding, its inverse problem, protein design, which predicts protein sequences from their 3D structures, still faces significant challenges. In particular, the lack of a large-scale standardized benchmark and poor accuracy hinder research progress. To standardize comparisons and draw more research interest, we use AlphaFold DB, one of the world's largest protein structure databases, to establish a new graph-based benchmark, AlphaDesign. Based on AlphaDesign, we propose a new method called ADesign that improves accuracy by introducing protein angles as new features, using a simplified graph transformer encoder (SGT), and proposing a confidence-aware protein decoder (CPD). SGT and CPD also improve model efficiency by simplifying the training and testing procedures. Experiments show that ADesign significantly outperforms previous graph models: average accuracy improves by 8%, and inference is more than 40 times faster.
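To make the accuracy figure concrete, here is a minimal sketch of how per-residue accuracy is commonly measured in protein design, assuming it means sequence recovery: the fraction of positions at which the predicted amino acid matches the native one. The abstract does not define the metric, so this reading is an assumption, and the function name `sequence_recovery` and the toy sequences are hypothetical.

```python
def sequence_recovery(predicted: str, native: str) -> float:
    """Return the fraction of positions where predicted matches native.

    Assumption: "accuracy" means per-residue sequence recovery; the
    abstract does not spell out the exact metric.
    """
    if len(predicted) != len(native):
        raise ValueError("sequences must have the same length")
    if not native:
        raise ValueError("sequences must be non-empty")
    # Count positions where the designed residue equals the native residue.
    matches = sum(p == n for p, n in zip(predicted, native))
    return matches / len(native)


# Hypothetical toy sequences, not taken from the benchmark.
native = "MKTAYIAK"
predicted = "MKSAYLAK"
print(sequence_recovery(predicted, native))  # 6 of 8 positions match: 0.75
```

A benchmark-level average would take the mean of this value over all test proteins, though the paper's exact aggregation may differ.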