Large-scale ligand-based virtual screening for SARS-CoV-2 inhibitors using deep neural networks

by Markus Hofmarcher, et al.

Due to the current severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic, there is an urgent need for novel therapies and drugs. We conducted a large-scale virtual screening for small molecules that are potential SARS-CoV-2 inhibitors. To this end, we utilized "ChemAI", a deep neural network trained on more than 220M data points across 3.6M molecules from three public drug-discovery databases. With ChemAI, we screened and ranked one billion molecules from the ZINC database for favourable effects against SARS-CoV-2. We then narrowed the results to the 30,000 top-ranked compounds, which are readily accessible and purchasable via the ZINC database. We provide these top-ranked compounds as a library for further screening with bioassays at
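The screen-rank-reduce step described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: `top_k_ranked` and `toy_score` are hypothetical names, and the scoring function is a placeholder standing in for a trained model such as ChemAI. The sketch only shows how a fixed-size heap can keep the k best-scoring entries of a library that is too large to sort in memory.

```python
import heapq
from typing import Callable, Iterable, List, Tuple


def top_k_ranked(
    molecules: Iterable[str],
    score_fn: Callable[[str], float],
    k: int,
) -> List[Tuple[float, str]]:
    """Return the k highest-scoring molecules, best first, in one streaming pass."""
    if k <= 0:
        return []
    heap: List[Tuple[float, str]] = []  # min-heap holding the current top k
    for smiles in molecules:
        item = (score_fn(smiles), smiles)
        if len(heap) < k:
            heapq.heappush(heap, item)
        elif item > heap[0]:
            # New item beats the weakest kept item: swap it in.
            heapq.heapreplace(heap, item)
    return sorted(heap, reverse=True)


def toy_score(smiles: str) -> float:
    # Placeholder (assumption): NOT an activity model, just the string length.
    return float(len(smiles))


library = ["CCO", "c1ccccc1", "CC(=O)O", "C", "CCN(CC)CC"]
top = top_k_ranked(library, toy_score, k=2)
print(top)
```

Memory use stays proportional to k rather than to the library size, which is the property that matters when the input is a stream of a billion SMILES strings and k is 30,000.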









Funding by the Institute for Machine Learning (JKU). All authors contributed equally to this work.

