AI- and HPC-enabled Lead Generation for SARS-CoV-2: Models and Processes to Extract Druglike Molecules Contained in Natural Language Text

01/12/2021
by   Zhi Hong, et al.

Researchers worldwide are seeking to repurpose existing drugs or discover new drugs to counter the disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). One promising source of candidates for such studies is the set of molecules that the scientific literature reports as drug-like in the context of coronavirus research. We report here on a project that leverages both human and artificial intelligence to detect references to drug-like molecules in free text. We engage non-expert humans to create a corpus of labeled text, use this labeled corpus to train a named entity recognition model, and employ the trained model to extract 10,912 drug-like molecules from the COVID-19 Open Research Dataset Challenge (CORD-19) corpus of 198,875 papers. Performance analyses show that our automated extraction model can perform on par with non-expert humans.
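As a rough illustration of the final extraction step, the sketch below turns token-level output from a named entity recognition model into a de-duplicated count of molecule mentions. It is not the authors' implementation: the BIO tagging scheme, the `DRUG` label, the function names, and the sample sentences are all assumptions made for this example, and the abstract does not say how model output is represented.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def decode_bio(tokens: List[str], tags: List[str], label: str = "DRUG") -> List[str]:
    """Collect entity mentions from parallel token/tag lists in BIO format.

    The BIO scheme and the "DRUG" label are assumptions for this sketch.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    mentions: List[str] = []
    current: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == f"B-{label}":
            # A new entity begins; flush any entity still being built.
            if current:
                mentions.append(" ".join(current))
            current = [token]
        elif tag == f"I-{label}" and current:
            # Continue the entity that is currently open.
            current.append(token)
        else:
            # Outside an entity (or a stray I- tag): close any open entity.
            if current:
                mentions.append(" ".join(current))
            current = []
    if current:
        mentions.append(" ".join(current))
    return mentions


def count_molecules(
    tagged_sentences: Iterable[Tuple[List[str], List[str]]]
) -> Counter:
    """Count case-insensitive molecule mentions across tagged sentences."""
    counts: Counter = Counter()
    for tokens, tags in tagged_sentences:
        for mention in decode_bio(tokens, tags):
            counts[mention.lower()] += 1
    return counts


# Hypothetical model output for two sentences (illustrative only).
SAMPLE = [
    (["Remdesivir", "and", "lopinavir", "were", "tested", "."],
     ["B-DRUG", "O", "B-DRUG", "O", "O", "O"]),
    (["Hydroxychloroquine", "sulfate", "and", "remdesivir"],
     ["B-DRUG", "I-DRUG", "O", "B-DRUG"]),
]

molecule_counts = count_molecules(SAMPLE)
```

Lower-casing before counting is one simple way to merge repeated mentions of the same molecule; a real pipeline would likely need stronger normalization, such as synonym resolution, before reporting a count of distinct molecules.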


