Mega-COV: A Billion-Scale Dataset of 65 Languages For COVID-19

05/02/2020
by Muhammad Abdul-Mageed, et al.

We describe Mega-COV, a billion-scale dataset from Twitter for studying COVID-19. The dataset is diverse (covering 234 countries), longitudinal (going as far back as 2007), multilingual (spanning 65 languages), and includes a significant number of location-tagged tweets (32M tweets). We release tweet IDs from the dataset in the hope that they will be useful for studying various phenomena related to the ongoing pandemic and for accelerating viable solutions to associated problems.
