TGDataset: a Collection of Over One Hundred Thousand Telegram Channels

03/09/2023
by Massimo La Morgia, et al.

Telegram is one of the most popular instant messaging apps in today's digital age. In addition to providing a private messaging service, Telegram, with its channels, represents a valid medium for rapidly broadcasting content to a large audience (e.g., COVID-19 announcements) but, unfortunately, also for disseminating radical ideologies and coordinating attacks (e.g., the Capitol Hill riot). This paper presents the TGDataset, a new dataset that includes 120,979 Telegram channels and over 400 million messages, making it, to the best of our knowledge, the largest collection of Telegram channels. After a brief introduction to the data collection process, we analyze the languages spoken within our dataset and the topics covered by English channels. Finally, we discuss some use cases in which our dataset can be extremely useful for better understanding the Telegram ecosystem, as well as for studying the diffusion of questionable news. In addition to the raw dataset, we released the scripts we used to analyze the dataset and the list of channels belonging to the network of a new conspiracy theory called Sabmyk.

research
01/23/2020

The Pushshift Telegram Dataset

Messaging platforms, especially those with a mobile focus, have become i...
research
05/10/2023

Vārta: A Large-Scale Headline-Generation Dataset for Indic Languages

We present Vārta, a large-scale multilingual dataset for headline genera...
research
05/27/2021

On the Globalization of the QAnon Conspiracy Theory Through Telegram

QAnon is a far-right conspiracy theory that became popular and mainstrea...
research
11/26/2021

Uncovering the Dark Side of Telegram: Fakes, Clones, Scams, and Conspiracy Movements

Telegram is one of the most used instant messaging apps worldwide. Some ...
research
09/24/2021

Universal Payment Channels: An Interoperability Platform for Digital Currencies

With the innovation of distributed ledger technology (DLT), often known ...
research
02/25/2021

Understanding Worldwide Private Information Collection on Android

Mobile phones enable the collection of a wealth of private information, ...
research
10/06/2020

Anubhuti – An annotated dataset for emotional analysis of Bengali short stories

Thousands of short stories and articles are being written in many differ...
