Ontology Driven Disease Incidence Detection on Twitter

11/21/2016
by   Mark Abraham Magumba, et al.

In this work we address generic automated disease incidence monitoring on Twitter. We employ an ontology of disease-related concepts and use it to obtain a conceptual representation of tweets. Unlike previous keyword-based systems and topic modeling approaches, our ontological approach allows us to apply more stringent criteria, such as spatial and temporal characteristics, for determining which messages are relevant, whilst giving a stronger guarantee that the resulting models will perform well on new data that may be lexically divergent. We achieve this by training learners on concepts rather than individual words. For training we use a dataset containing mentions of influenza and Listeria, and we use the learned models to classify datasets containing mentions of an arbitrary selection of other diseases. We show that our ontological approach achieves good performance on this task using a variety of natural language processing techniques. We also show that word vectors can be learned directly from our concepts to achieve even better results.
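To illustrate the idea of representing tweets as concepts rather than words, the following is a minimal sketch and not the authors' implementation. The toy ontology, its concept labels (DISEASE, SELF, AFFLICTION, TIME_PRESENT, TIME_PAST), and the helper `to_concepts` are all hypothetical and chosen only for illustration; the actual ontology is described in the full text.

```python
import re

# Hypothetical toy ontology mapping surface terms to concept labels.
# These entries are illustrative assumptions, not the paper's ontology.
TOY_ONTOLOGY = {
    "influenza": "DISEASE",
    "flu": "DISEASE",
    "listeria": "DISEASE",
    "cholera": "DISEASE",
    "i": "SELF",
    "my": "SELF",
    "have": "AFFLICTION",
    "caught": "AFFLICTION",
    "got": "AFFLICTION",
    "today": "TIME_PRESENT",
    "now": "TIME_PRESENT",
    "yesterday": "TIME_PAST",
}


def to_concepts(tweet):
    """Replace each known token with its concept label; drop unknown tokens."""
    tokens = re.findall(r"[a-z]+", tweet.lower())
    return [TOY_ONTOLOGY[t] for t in tokens if t in TOY_ONTOLOGY]


# A tweet about a disease seen in training and one about an unseen disease.
train_repr = to_concepts("I caught the flu today")
unseen_repr = to_concepts("I got cholera now")

print(train_repr)
print(unseen_repr)
```

In this sketch the two tweets share almost no words, yet they map to the same concept sequence, which is why a learner trained on concept features could in principle transfer to diseases it never saw by name.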

