Targeting SARS-CoV-2 with AI- and HPC-enabled Lead Generation: A First Data Release

05/28/2020
by   Yadu Babuji, et al.

Researchers across the globe are seeking to rapidly repurpose existing drugs or discover new drugs to counter the novel coronavirus disease (COVID-19) caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). One promising approach is to train machine learning (ML) and artificial intelligence (AI) tools to screen large numbers of small molecules. As a contribution to that effort, we are aggregating numerous small molecules from a variety of sources, using high-performance computing (HPC) to compute diverse properties of those molecules, using the computed properties to train ML/AI models, and then using the resulting models for screening. In this first data release, we make available 23 datasets collected from community sources, representing over 4.2 B molecules and enriched with pre-computed: 1) molecular fingerprints to aid similarity searches, 2) 2D images of molecules to enable exploration and application of image-based deep learning methods, and 3) 2D and 3D molecular descriptors to speed development of machine learning models. This data release encompasses structural information on the 4.2 B molecules and 60 TB of pre-computed data. Future releases will expand the data to include more detailed molecular simulations, computed models, and other products.
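
To illustrate how pre-computed molecular fingerprints can aid similarity searches, the following is a minimal sketch that ranks a small library against a query using the Tanimoto coefficient. It assumes each fingerprint is represented as the set of its "on" bit indices; the molecule names, bit values, and helper functions (`tanimoto`, `rank_by_similarity`) are hypothetical and do not reflect the file format or tooling of the data release.

```python
from typing import Dict, FrozenSet, List, Tuple

# Assumption: a fingerprint is modeled as the set of indices of its "on" bits.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient: shared on-bits divided by the union of on-bits."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def rank_by_similarity(
    query: Fingerprint, library: Dict[str, Fingerprint], top_k: int = 3
) -> List[Tuple[str, float]]:
    """Return the top_k library entries, most similar to the query first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    # Sort by descending similarity, breaking ties by name for determinism.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_k]


# Toy library with made-up fingerprints (not taken from the data release).
library = {
    "mol_a": frozenset({1, 4, 9, 16}),
    "mol_b": frozenset({1, 4, 9, 25}),
    "mol_c": frozenset({2, 3, 5}),
}
query = frozenset({1, 4, 9, 16})
hits = rank_by_similarity(query, library, top_k=2)
print(hits)
```

At the scale of billions of molecules, a brute-force scan like this one would likely be replaced by packed bit vectors and an indexed or parallel search; the sketch only shows the comparison that pre-computed fingerprints make cheap.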


research
02/10/2021

Artificial Intelligence based Autonomous Molecular Design for Medical Therapeutic: A Perspective

Domain-aware machine learning (ML) models have been increasingly adopted...
research
01/12/2021

AI- and HPC-enabled Lead Generation for SARS-CoV-2: Models and Processes to Extract Druglike Molecules Contained in Natural Language Text

Researchers worldwide are seeking to repurpose existing drugs or discove...
research
06/03/2023

Mitigating Molecular Aggregation in Drug Discovery with Predictive Insights from Explainable AI

As the importance of high-throughput screening (HTS) continues to grow d...
research
12/22/2022

Realizing Molecular Machine Learning through Communications for Biological AI: Future Directions and Challenges

Artificial Intelligence (AI) and Machine Learning (ML) are weaving their...
research
09/03/2021

IMG2SMI: Translating Molecular Structure Images to Simplified Molecular-input Line-entry System

Like many scientific fields, new chemistry literature has grown at a sta...
research
05/01/2021

Combating small molecule aggregation with machine learning

Biological screens are plagued by false positive hits resulting from agg...
research
03/31/2022

SELFIES and the future of molecular string representations

Artificial intelligence (AI) and machine learning (ML) are expanding in ...
