Toponym Identification in Epidemiology Articles -- A Deep Learning Approach

04/24/2019
by MohammadReza Davari, et al.

When analyzing the spread of viruses, epidemiologists often need to identify the location of infected hosts. This information can be found in public databases such as GenBank; however, the information provided in these databases is usually limited to the country or state level. More fine-grained localization requires phylogeographers to manually read the relevant scientific articles. In this work we propose an approach to automate place name identification in medical (epidemiology) articles. The focus of this paper is a deep learning based model for toponym detection, together with experiments on the use of external linguistic features and domain-specific information. The model was evaluated on a collection of 105 epidemiology articles from PubMed Central (Weissenbacher et al., 2015) provided by the recent SemEval-2019 Task 12. Our best detection model achieves an F1 score of 80.13%, a significant improvement over the previous state of the art of 69.84%. These results underline the importance of domain-specific embeddings, as well as specific linguistic features, for toponym detection in medical journals.
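The reported F1 scores combine precision and recall over detected toponym mentions. As a rough illustration only, the sketch below computes precision, recall, and F1 under an assumed strict matching rule, where a predicted span counts as correct only if its character offsets exactly match a gold span. The abstract does not state the matching criterion used in the evaluation, and the function name `span_f1` and the `(start, end)` span format are hypothetical.

```python
from typing import Iterable, Set, Tuple

# Hypothetical span format: (start, end) character offsets of a toponym mention.
Span = Tuple[int, int]


def span_f1(gold: Iterable[Span], predicted: Iterable[Span]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under exact span matching (an assumption)."""
    gold_set: Set[Span] = set(gold)
    pred_set: Set[Span] = set(predicted)

    # A prediction is a true positive only if the same span appears in the gold set.
    true_positives = len(gold_set & pred_set)

    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0

    if precision + recall == 0:
        return precision, recall, 0.0

    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example with made-up offsets: 3 gold mentions, 4 predictions, 2 exact matches.
gold_spans = [(0, 5), (10, 15), (20, 27)]
predicted_spans = [(0, 5), (10, 15), (30, 34), (40, 44)]
precision, recall, f1 = span_f1(gold_spans, predicted_spans)
```

In this toy case two of four predictions are correct and two of three gold mentions are found, so precision is 1/2, recall is 2/3, and F1 is 4/7. A more lenient scheme that credits partially overlapping spans would give different numbers for the same predictions.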
