Looking Through Glass: Knowledge Discovery from Materials Science Literature using Natural Language Processing

01/05/2021
by Vineeth Venugopal, et al.

Most of the knowledge in materials science literature is in the form of unstructured data such as text and images. Here, we present a framework employing natural language processing that automates text and image comprehension and precise knowledge extraction from the literature on inorganic glasses. The abstracts are automatically categorized using latent Dirichlet allocation (LDA), providing a way to classify and search semantically linked publications. Similarly, a comprehensive summary of images and plots is presented using the 'Caption Cluster Plot' (CCP), which provides direct access to the images buried in the papers. Finally, we combine the LDA and CCP with the chemical elements occurring in the manuscript to present an 'Elemental map', a topical and image-wise distribution of chemical elements in the literature. Overall, the framework presented here can serve as a generic and powerful tool to extract and disseminate material-specific information on composition-structure-processing-property dataspaces, allowing insights into fundamental problems relevant to the materials science community and accelerating materials discovery.
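As a rough illustration of the 'Elemental map' idea, the sketch below counts how many documents in each topic mention each chemical element. It is a minimal sketch under stated assumptions, not the authors' implementation: the topic labels are assumed to come from an earlier step such as LDA, the regular-expression formula matcher and the small `ELEMENTS` subset are hypothetical simplifications, and the names `elements_in` and `elemental_map` are invented for this example.

```python
import re
from collections import Counter, defaultdict

# Illustrative subset of element symbols (an assumption; a real run would
# need the full periodic table and handling of false positives such as the
# word "In" or acronyms made of capital letters).
ELEMENTS = {"Si", "O", "Na", "B", "Al", "Ca", "Ge", "Se", "Li", "K"}

# One element-like symbol: an uppercase letter, optionally one lowercase letter.
SYMBOL = re.compile(r"[A-Z][a-z]?")
# A formula-like token such as "SiO2" or "Na2O": symbols with optional counts.
FORMULA = re.compile(r"\b(?:[A-Z][a-z]?\d*)+\b")


def elements_in(text):
    """Count occurrences of known element symbols inside formula-like tokens."""
    found = Counter()
    for token in FORMULA.findall(text):
        for symbol in SYMBOL.findall(token):
            if symbol in ELEMENTS:
                found[symbol] += 1
    return found


def elemental_map(docs):
    """Map each topic to a Counter of element -> number of documents mentioning it.

    `docs` is an iterable of (topic_label, text) pairs; the topic labels are
    assumed to have been assigned beforehand (for example by a topic model).
    """
    table = defaultdict(Counter)
    for topic, text in docs:
        # Use the set of elements so each document counts once per element.
        table[topic].update(set(elements_in(text)))
    return dict(table)


if __name__ == "__main__":
    sample = [
        ("silicate", "SiO2-Na2O glasses were melted."),
        ("chalcogenide", "GeSe2 films were deposited."),
    ]
    for topic, counts in elemental_map(sample).items():
        print(topic, dict(counts))
```

Counting documents rather than raw mentions keeps a single formula-heavy abstract from dominating a topic; either choice is defensible for a sketch like this.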


