Towards Constructing a Corpus for Studying the Effects of Treatments and Substances Reported in PubMed Abstracts

12/04/2019
by Evgeni Stefchov, et al.

We present the construction of an annotated corpus of PubMed abstracts that report positive, negative or neutral effects of treatments or substances. Our ultimate goal is to annotate one sentence (the rationale) for each abstract and to use this resource as a training set for text classification of the effects discussed in PubMed abstracts. Currently, the corpus consists of 750 abstracts. We describe the automatic processing that supports the corpus construction, the manual annotation activities, and some features of the medical language in the abstracts selected for the annotated corpus. It turns out that recognizing the terminology and the abbreviations is key for determining the rationale sentence. The corpus will be applied to improve our classifier, which currently has an accuracy of 78.80% using terms based on UMLS concepts from specific semantic groups and an SVM with a linear kernel. Finally, we discuss some other possible applications of this corpus.
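To make the idea of concept-based term features concrete, here is a minimal sketch of mapping surface terms and abbreviations in a sentence to concept identifiers and keeping only selected semantic groups. It is not the authors' pipeline: the lexicon, the concept identifiers, the group labels and the function name `concept_features` are all hypothetical placeholders, and a real system would look terms up in UMLS rather than in a hand-written dictionary.

```python
import re

# Hypothetical toy lexicon: surface form -> (concept id, semantic group).
# The ids and group labels are placeholders, not real UMLS identifiers.
TOY_LEXICON = {
    "aspirin": ("CONCEPT_ASPIRIN", "CHEM"),
    "asa": ("CONCEPT_ASPIRIN", "CHEM"),  # abbreviation, same concept
    "myocardial infarction": ("CONCEPT_MI", "DISO"),
    "mi": ("CONCEPT_MI", "DISO"),  # abbreviation, same concept
    "placebo": ("CONCEPT_PLACEBO", "CHEM"),
}

# Assumed set of semantic groups to keep as features.
KEPT_GROUPS = {"CHEM", "DISO"}

# Longest term (in words) present in the toy lexicon.
MAX_TERM_WORDS = 2


def concept_features(text, lexicon=TOY_LEXICON, kept_groups=KEPT_GROUPS):
    """Return concept ids found in text, using greedy longest-match lookup."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    features = []
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate term first, then shorter ones.
        for n in range(MAX_TERM_WORDS, 0, -1):
            if i + n > len(tokens):
                continue
            candidate = " ".join(tokens[i:i + n])
            if candidate in lexicon:
                concept_id, group = lexicon[candidate]
                if group in kept_groups:
                    features.append(concept_id)
                i += n
                matched = True
                break
        if not matched:
            i += 1  # token is not a known term; skip it
    return features


sentence = "ASA reduced the risk of myocardial infarction compared with placebo."
print(concept_features(sentence))
```

In this sketch the abbreviation "ASA" and the full term "aspirin" collapse to the same feature, which illustrates why recognizing abbreviations matters. The resulting concept lists could then be vectorized and passed to a linear-kernel SVM, a step not shown here.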

