Towards Constructing a Corpus for Studying the Effects of Treatments and Substances Reported in PubMed Abstracts

by Evgeni Stefchov, et al.

We present the construction of an annotated corpus of PubMed abstracts that report positive, negative or neutral effects of treatments or substances. Our ultimate goal is to annotate one sentence (the rationale) for each abstract and to use this resource as a training set for text classification of the effects discussed in PubMed abstracts. Currently, the corpus consists of 750 abstracts. We describe the automatic processing that supports the corpus construction, the manual annotation activities, and some features of the medical language in the abstracts selected for the annotated corpus. It turns out that recognizing the terminology and the abbreviations is key to determining the rationale sentence. The corpus will be applied to improve our classifier, which currently has an accuracy of 78.80% using terms based on UMLS concepts from specific semantic groups and an SVM with a linear kernel. Finally, we discuss some other possible applications of this corpus.






