MaScQA: A Question Answering Dataset for Investigating Materials Science Knowledge of Large Language Models

08/17/2023
by Mohd Zaki, et al.

Information extraction and textual comprehension from materials literature are vital for developing an exhaustive knowledge base that enables accelerated materials discovery. Language models have demonstrated their capability to answer domain-specific questions and retrieve information from knowledge bases. However, there are no benchmark datasets in the materials domain that can evaluate how well these language models understand its key concepts. In this work, we curate a dataset of 650 challenging questions from the materials domain that require the knowledge and skills of a materials student who has completed an undergraduate degree. We classify these questions based on their structure and on materials science domain-based subcategories. Further, we evaluate the performance of the GPT-3.5 and GPT-4 models on these questions via zero-shot and chain-of-thought prompting. GPT-4 gives the best performance (~62% accuracy). In contrast to the general observation, chain-of-thought prompting yields no significant improvement in accuracy. To evaluate the limitations, we performed an error analysis, which revealed conceptual errors (~64%) as the major contributor, compared to computational errors (~36%), to the reduced performance of LLMs. We hope that the dataset and the analysis performed in this work will promote further research into better materials science domain-specific LLMs and strategies for information extraction.
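To make the evaluation described in the abstract concrete, the following is a minimal sketch of how zero-shot and chain-of-thought prompts could be built and how accuracy and error shares could be tallied. The prompt wording, the function names (`build_prompt`, `accuracy`, `error_shares`), and the exact-match scoring rule are all assumptions made for illustration; they are not taken from the paper, and no model call is included.

```python
from collections import Counter

# Hypothetical prompt templates; the wording is an assumption,
# not the prompts used by the authors.
ZERO_SHOT = "Solve the following question.\n\n{question}"
CHAIN_OF_THOUGHT = (
    "Solve the following question step by step, "
    "then state the final answer.\n\n{question}"
)


def build_prompt(question: str, chain_of_thought: bool = False) -> str:
    """Wrap a question in the zero-shot or chain-of-thought template."""
    template = CHAIN_OF_THOUGHT if chain_of_thought else ZERO_SHOT
    return template.format(question=question)


def accuracy(predictions: list[str], answers: list[str]) -> float:
    """Fraction of predictions equal to the reference answer.

    Matching is case-insensitive and ignores surrounding whitespace
    (an assumed scoring rule).
    """
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have equal length")
    if not answers:
        return 0.0
    correct = sum(
        p.strip().lower() == a.strip().lower()
        for p, a in zip(predictions, answers)
    )
    return correct / len(answers)


def error_shares(error_labels: list[str]) -> dict[str, float]:
    """Share of each manually assigned error label among wrong answers."""
    counts = Counter(error_labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}
```

In this sketch, the error labels (for example "conceptual" or "computational") would come from manual inspection of the incorrect answers, and `error_shares` only summarizes them.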
