HelixFold: An Efficient Implementation of AlphaFold2 using PaddlePaddle

07/12/2022
by Guoxia Wang, et al.

Accurate protein structure prediction can significantly accelerate progress in the life sciences. AlphaFold2, a frontier end-to-end structure prediction system, already achieves accuracy close to that of experimental determination techniques. Because of its complex model architecture and large memory consumption, training AlphaFold2 and running inference with it from scratch requires substantial computational resources and time, which makes the original AlphaFold2 too expensive for most individuals and institutions. Reducing this cost could therefore accelerate progress in the life sciences. We implement AlphaFold2 using PaddlePaddle, under the name HelixFold, to improve training and inference speed and reduce memory consumption. Performance is improved through operator fusion, tensor fusion, and hybrid parallelism computation, while memory usage is optimized through Recompute, BFloat16, and in-place memory read/write. The original AlphaFold2 (implemented in JAX) and OpenFold (implemented in PyTorch) both take about 11 days to complete the full end-to-end training; HelixFold needs only 7.5 days, and only 5.3 days when using hybrid parallelism, which cuts the training time roughly in half. We verified that HelixFold's accuracy can be on par with that of AlphaFold2 on the CASP14 and CAMEO datasets. HelixFold's code is freely available on GitHub at https://github.com/PaddlePaddle/PaddleHelix/tree/dev/apps/protein_folding/helixfold, and we also provide a stable web service at https://paddlehelix.baidu.com/app/drug/protein/forecast.
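
As a quick check on the training-time comparison, the short Python sketch below converts the day counts quoted in the abstract into a speedup factor and a fraction of time saved. The function names are illustrative only and do not come from the HelixFold code base, and the 11-day baseline is the approximate figure given in the abstract.

```python
# Training times in days, as quoted in the abstract.
BASELINE_DAYS = 11.0         # AlphaFold2 (JAX) and OpenFold (PyTorch), "about 11 days"
HELIXFOLD_DAYS = 7.5         # HelixFold, full end-to-end training
HELIXFOLD_HYBRID_DAYS = 5.3  # HelixFold with hybrid parallelism


def speedup(baseline_days: float, new_days: float) -> float:
    """Return how many times faster the new run is than the baseline."""
    return baseline_days / new_days


def time_saved_fraction(baseline_days: float, new_days: float) -> float:
    """Return the fraction of the baseline training time that is saved."""
    return 1.0 - new_days / baseline_days


for label, days in [
    ("HelixFold", HELIXFOLD_DAYS),
    ("HelixFold + hybrid parallelism", HELIXFOLD_HYBRID_DAYS),
]:
    print(
        f"{label}: {speedup(BASELINE_DAYS, days):.2f}x faster, "
        f"{time_saved_fraction(BASELINE_DAYS, days):.0%} less training time"
    )
```

With these inputs, the hybrid-parallelism run comes out at a little over 2x the baseline speed, that is, about half the training time, while the 7.5-day run saves roughly a third.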
