Text to Insight: Accelerating Organic Materials Knowledge Extraction via Deep Learning

09/27/2021
by   Xintong Zhao, et al.

Scientific literature is one of the most significant resources for sharing knowledge. Researchers turn to scientific literature as a first step in designing an experiment. Given the extensive and growing volume of literature, the common approach of reading and manually extracting knowledge is too time-consuming, creating a bottleneck in the research cycle. This challenge spans nearly every scientific domain. In materials science, experimental data distributed across millions of publications are extremely helpful for predicting materials properties and designing novel materials. However, only recently have researchers explored computational approaches to knowledge extraction, and primarily for inorganic materials. This study aims to explore knowledge extraction for organic materials. We built a research dataset composed of 855 annotated and 708,376 unannotated sentences drawn from 92,667 abstracts. We used named-entity recognition (NER) with a BiLSTM-CNN-CRF deep learning model to automatically extract key knowledge from the literature. Early-phase results show high potential for automated knowledge extraction. The paper presents our findings and a framework for supervised knowledge extraction that can be adapted to other scientific domains.
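To make the NER step more concrete, the sketch below shows how per-token labels from a sequence tagger can be collapsed into extracted entities. It assumes the common BIO tagging scheme; the abstract does not state the tagging scheme or label set, so the labels `MATERIAL`, `PROPERTY`, and `VALUE`, the helper `bio_to_spans`, and the example sentence are all hypothetical illustrations rather than the authors' setup. The tags are written by hand here; in practice they would come from a trained model such as the BiLSTM-CNN-CRF.

```python
from typing import List, Optional, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collapse per-token BIO tags into (entity_text, entity_type) pairs.

    Hypothetical helper for illustration; not taken from the paper.
    """
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new entity, closing any open one.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens = [token]
            current_type = tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # An I- tag of the same type continues the open entity.
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not match the open entity,
            # closes the open entity; the token itself is discarded.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens = []
            current_type = None

    if current_tokens:
        spans.append((" ".join(current_tokens), current_type))
    return spans


# Made-up sentence and hand-written tags (assumed label set, not the paper's).
tokens = ["The", "polystyrene", "film", "has", "a", "glass",
          "transition", "temperature", "of", "373", "K", "."]
tags = ["O", "B-MATERIAL", "O", "O", "O", "B-PROPERTY",
        "I-PROPERTY", "I-PROPERTY", "O", "B-VALUE", "I-VALUE", "O"]

spans = bio_to_spans(tokens, tags)
for text, label in spans:
    print(f"{label}: {text}")
```

Grouping tokens this way turns a tagged sentence into material, property, and value mentions that could then be collected across many abstracts.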


