Claim Extraction in Biomedical Publications using Deep Discourse Model and Transfer Learning

07/01/2019
by   Titipat Achakulvisut, et al.

Claims are a fundamental unit of scientific discourse. The exponential growth in the number of scientific publications makes automatic claim extraction an important problem for researchers overwhelmed by information overload. An automated claim extraction system is useful for both manual and programmatic exploration of scientific knowledge. In this paper, we introduce an online claim extraction system and a dataset of 1,500 scientific abstracts from the biomedical domain, with expert annotations indicating whether each sentence presents a scientific claim. We compare our proposed model with several baselines, including rule-based and deep learning techniques. Our transfer learning approach with a fine-tuning step allows us to bootstrap from a large discourse-annotated dataset (Pubmed-RCT) and obtains an F1-score over 0.78 for claim detection while using a small annotated dataset of 750 papers. We show that using this model, pre-trained on the discourse prediction task, improves the F1-score by over 14 absolute percentage points compared to a baseline model without discourse structure. We release publicly accessible tools for the discourse model and the claim detection model, along with an annotation tool. We discuss further applications beyond biomedical literature.
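The abstract reports results as an F1-score over sentence-level claim labels. As a minimal sketch of how such a score can be computed, assuming binary labels where 1 marks a claim sentence, the snippet below calculates F1 for the positive class. The function name `f1_score` and the toy labels are hypothetical illustrations, not the authors' evaluation code or data.

```python
from typing import Sequence


def f1_score(gold: Sequence[int], predicted: Sequence[int]) -> float:
    """F1 for the positive class (1 = sentence presents a claim)."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, predicted) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 0)
    if tp == 0:
        # No correct positive predictions: F1 is defined as 0 here.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for one six-sentence abstract (not data from the paper).
gold = [0, 0, 0, 1, 1, 1]
predicted = [0, 0, 1, 1, 1, 0]
score = f1_score(gold, predicted)
print(f"F1 = {score:.3f}")
```

Because F1 is the harmonic mean of precision and recall, it penalizes both missed claims and spurious ones. That suits a task where claim sentences are likely a minority of the sentences in an abstract.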

research
09/10/2019

Discourse Tagging for Scientific Evidence Extraction

The biomedical scientific literature comprises a crucial, sometimes life...
research
08/29/2019

Scientific Statement Classification over arXiv.org

We introduce a new classification task for scientific statements and rel...
research
06/15/2022

SciTweets – A Dataset and Annotation Framework for Detecting Scientific Online Discourse

Scientific topics, claims and resources are increasingly debated as part...
research
09/21/2018

Towards Automated Factchecking: Developing an Annotation Schema and Benchmark for Consistent Automated Claim Detection

In an effort to assist factcheckers in the process of factchecking, we t...
research
01/28/2021

LESA: Linguistic Encapsulation and Semantic Amalgamation Based Generalised Claim Detection from Online Content

The conceptualization of a claim lies at the core of argument mining. Th...
research
11/11/2019

NegBERT: A Transfer Learning Approach for Negation Detection and Scope Resolution

Negation is an important characteristic of language, and a major compone...
research
01/26/2022

FiNCAT: Financial Numeral Claim Analysis Tool

While making investment decisions by reading financial documents, invest...
