Did You Really Just Have a Heart Attack? Towards Robust Detection of Personal Health Mentions in Social Media

by Payam Karisani, et al.

Millions of users share their experiences on social media sites such as Twitter, generating valuable data for public health monitoring, digital epidemiology, and other analyses of population health at global scale. The first critical task for these applications is classifying whether a post mentions a personal health event, which we call the Personal Health Mention (PHM) problem. This task is challenging for many reasons, including the typically short length of social media posts, inventive spelling and lexicons, and figurative language, such as hyperbole that uses diseases like "heart attack" or "cancer" for emphasis rather than as a health self-report. The problem is even more challenging for conditions that are rarely reported, or that are frequently mentioned but ambiguously expressed, such as "stroke". To address this problem, we propose a general, robust method for detecting PHMs in social media, which we call WESPAD, that combines lexical, syntactic, word embedding-based, and context-based features. WESPAD is able to generalize from few examples by automatically distorting the word embedding space to most effectively detect the true health mentions. Unlike previously proposed state-of-the-art supervised and deep-learning techniques, WESPAD requires relatively little training data, which makes it possible to adapt it, with minimal effort, to each new disease and condition. We evaluate WESPAD both on an established, publicly available flu detection benchmark and on a new dataset that we have constructed with mentions of multiple health conditions. The experiments show that WESPAD outperforms the baselines and state-of-the-art methods, especially when the number and proportion of true health mentions in the training data are small.


