VoterFraud2020: a Multi-modal Dataset of Election Fraud Claims on Twitter

by Anton Abilov, et al.

The wide spread of unfounded election fraud claims surrounding the U.S. 2020 election undermined trust in the election, culminating in violence inside the U.S. Capitol. Under these circumstances, it is critical to understand the discussions surrounding these claims on Twitter, a major platform where the claims are disseminated. To this end, we collected and are releasing the VoterFraud2020 dataset, a multi-modal dataset with 7.6M tweets and 25.6M retweets from 2.6M users related to voter fraud claims. To make this data immediately useful for a wide range of researchers, we further enhance the data with cluster labels computed from the retweet graph, user suspension status, and perceptual hashes of tweeted images. We also include in the dataset aggregated information for all external links and YouTube videos that appear in the tweets. Preliminary analyses of the data show that Twitter's ban actions mostly affected a specific community of voter fraud claim promoters, and expose the most common URLs, images, and YouTube videos shared in the data.
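
For readers unfamiliar with perceptual hashing, the sketch below illustrates the general idea of matching near-duplicate images by comparing short bit fingerprints. It is only an illustration: the abstract does not say which hashing algorithm the authors used, so a simple average hash is swapped in here, and the function names and the tiny 2x2 grayscale grids are hypothetical. A real pipeline would first decode each image and shrink it to a small grayscale grid.

```python
def average_hash(pixels):
    """Hash a small grayscale grid: one bit per pixel, set when pixel >= mean.

    Assumption: `pixels` is a list of rows of brightness values that has
    already been downscaled; this is an illustrative average hash, not
    necessarily the method used to build the dataset.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p >= mean else 0)
    return bits


def hamming_distance(a, b):
    """Count the bit positions where two hashes differ."""
    return bin(a ^ b).count("1")


# Hypothetical toy images: img_b is img_a with every pixel brightened by 10,
# img_c has the bright and dark regions swapped.
img_a = [[10, 200], [220, 30]]
img_b = [[20, 210], [230, 40]]
img_c = [[200, 10], [30, 220]]

near_duplicate = hamming_distance(average_hash(img_a), average_hash(img_b))
different = hamming_distance(average_hash(img_a), average_hash(img_c))
```

In this toy case the uniformly brightened copy hashes identically to the original, while the inverted layout differs in every bit, which is the property that lets such hashes group re-shared images despite small edits.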





ArCOV19-Rumors: Arabic COVID-19 Twitter Dataset for Misinformation Detection

In this paper we introduce ArCOV19-Rumors, an Arabic COVID-19 Twitter da...

Emojis as Anchors to Detect Arabic Offensive Language and Hate Speech

We introduce a generic, language-independent method to collect a large p...

The Role of the Crowd in Countering Misinformation: A Case Study of the COVID-19 Infodemic

Fact checking by professionals is viewed as a vital defense in the fight...

Claim Detection in Biomedical Twitter Posts

Social media contains unfiltered and unique information, which is potent...

"I Won the Election!": An Empirical Analysis of Soft Moderation Interventions on Twitter

Over the past few years, there is a heated debate and serious public con...

Analysing the Extent of Misinformation in Cancer Related Tweets

Twitter has become one of the most sought after places to discuss a wide...

Code Repositories


A multi-modal Twitter dataset with 7.6M tweets and 25.6M retweets related to voter fraud claims.
