Leveraging deep learning for fully automated NMR protein structure determination
Nuclear Magnetic Resonance (NMR) spectroscopy is one of the major techniques in structural biology, with over 11,800 protein structures deposited in the Protein Data Bank. NMR can elucidate structures and dynamics of small and medium-sized proteins in solution, living cells, and solids, but has been limited by the tedious data analysis process. It typically requires weeks or months of manual work by a trained expert to turn NMR measurements into a protein structure. Automation of this process is an open problem, formulated in the field over 30 years ago. Here, we present the first approach that addresses this challenge. Our method, ARTINA, uses as input only NMR spectra and the protein sequence, delivering a structure strictly without any human intervention. Tested on a 100-protein benchmark (1329 2D/3D/4D NMR spectra), ARTINA demonstrated its ability to solve structures with 1.44 Å median RMSD to the PDB reference and 91.36% correct NMR resonance assignments. ARTINA can be used by non-experts, reducing the effort for a protein structure determination by NMR essentially to the preparation of the sample and the spectra measurements.