A large-scale Twitter dataset for drug safety applications mined from publicly existing resources

03/31/2020
by Ramya Tekumalla, et al.

With the growing popularity of deep learning models for natural language processing (NLP) tasks, the field of pharmacovigilance, and more specifically the identification of adverse drug reactions (ADRs), has an inherent need for large-scale social media datasets aimed at such tasks. Most researchers spend large amounts of time crawling Twitter or buy expensive pre-curated datasets, and then have humans annotate the data manually; these approaches do not scale well as more and more data keeps flowing into Twitter. In this work, we repurpose a publicly available archived dataset of more than 9.4 billion tweets with the objective of creating a very large dataset of drug usage-related tweets. Using existing manually curated datasets from the literature, we then validate our filtered tweets for relevance using machine learning methods, with the end result of a publicly available dataset of 1,181,993 tweets. We provide all code, a detailed procedure for extracting this dataset, and the selected tweet IDs for researchers to use.
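The filtering step described above (reducing an archived tweet collection to drug-related tweets) can be illustrated with a minimal sketch. It assumes the archive is stored as one JSON object per line with `id_str` and `text` fields, and it uses a tiny hypothetical term list (`DRUG_TERMS`); the authors' actual dictionary, file layout, and code are not given in this abstract, so every name below is an assumption made for illustration.

```python
import json
import re
from typing import Dict, Iterable, Iterator, Pattern

# Hypothetical drug-term list for illustration only; the dictionary used
# by the authors is not specified in this abstract.
DRUG_TERMS = ["ibuprofen", "metformin", "duloxetine"]


def build_matcher(terms: Iterable[str]) -> Pattern[str]:
    """Compile one case-insensitive, whole-word pattern covering all terms."""
    # Longer terms first so multi-word names win over their prefixes.
    ordered = sorted(terms, key=len, reverse=True)
    alternation = "|".join(re.escape(term) for term in ordered)
    return re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)


def filter_drug_tweets(lines: Iterable[str], matcher: Pattern[str]) -> Iterator[Dict[str, str]]:
    """Yield the id and text of each tweet whose text mentions a listed term."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            tweet = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed records instead of failing the whole run
        if not isinstance(tweet, dict):
            continue
        text = tweet.get("text", "")
        if isinstance(text, str) and matcher.search(text):
            yield {"id": tweet.get("id_str"), "text": text}


# Small in-memory sample standing in for an archive file (assumed format).
sample_lines = [
    '{"id_str": "1", "text": "Took Ibuprofen and my stomach hurts"}',
    '{"id_str": "2", "text": "Nice weather today"}',
    "not json",
    '{"id_str": "3", "text": "metformins"}',
]
matched = list(filter_drug_tweets(sample_lines, build_matcher(DRUG_TERMS)))
```

Because the function consumes lines lazily and yields matches one at a time, the same sketch could be pointed at an open file handle rather than an in-memory list. The relevance validation with machine learning methods mentioned in the abstract would be a separate step applied to the matched tweets.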

research
10/23/2018

TweetsKB: A Public and Large-Scale RDF Corpus of Annotated Tweets

Publicly available social media archives facilitate research in a variet...
research
04/07/2021

HumAID: Human-Annotated Disaster Incidents Data from Twitter with Deep Learning Benchmarks

Social networks are widely used for information consumption and dissemin...
research
06/29/2020

Is Japanese gendered language used on Twitter ? A large scale study

This study analyzes the usage of Japanese gendered language on Twitter. ...
research
08/03/2021

Predicting Zip Code-Level Vaccine Hesitancy in US Metropolitan Areas Using Machine Learning Models on Public Tweets

Although the recent rise and uptake of COVID-19 vaccines in the United S...
research
03/08/2020

Utilizing Deep Learning to Identify Drug Use on Twitter Data

The collection and examination of social media has become a useful mecha...
research
10/08/2016

Mining the Web for Pharmacovigilance: the Case Study of Duloxetine and Venlafaxine

Adverse reactions caused by drugs following their release into the marke...
research
05/16/2018

#phramacovigilance - Exploring Deep Learning Techniques for Identifying Mentions of Medication Intake from Twitter

Mining social media messages for health and drug related information has...