BanFakeNews: A Dataset for Detecting Fake News in Bangla

04/19/2020
by Md Zobaer Hossain, et al.

The damage that rapidly spreading fake news can cause in sectors such as politics and finance has drawn the research community's attention to automatic fake news identification through linguistic analysis. However, such methods are being developed largely for English, while low-resource languages remain out of focus, even though the risks posed by fake and manipulative news are not confined to any one language. In this work, we propose an annotated dataset of 50K news articles that can be used to build automated fake news detection systems for a low-resource language like Bangla. Additionally, we provide an analysis of the dataset and develop a benchmark system with state-of-the-art NLP techniques to identify Bangla fake news. To create this system, we explore traditional linguistic features and neural network based methods. We expect this dataset to be a valuable resource for building technologies that prevent the spread of fake news and to contribute to research on low-resource languages.
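
As an illustration of what a "traditional linguistic feature" baseline might start from, the following minimal sketch computes character n-gram TF-IDF weights in plain Python. It is not the benchmark system described in the abstract: the function names `char_ngrams` and `tfidf_vectors`, the n-gram range, and the smoothed IDF formula are all assumptions chosen for illustration, and the sample strings are toy placeholders rather than items from the dataset.

```python
import math
from collections import Counter


def char_ngrams(text, n_min=1, n_max=3):
    """Return character n-grams of length n_min..n_max.

    Slicing works on Unicode code points, so Bangla text is handled
    without any language-specific tokenizer (an assumption of this sketch).
    """
    text = " ".join(text.split())  # collapse runs of whitespace
    return [
        text[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(text) - n + 1)
    ]


def tfidf_vectors(docs, n_min=1, n_max=3):
    """Map each document to a sparse {ngram: weight} dictionary.

    Uses relative term frequency and a smoothed IDF,
    idf = ln((1 + N) / (1 + df)) + 1, which is one common variant.
    """
    counts = [Counter(char_ngrams(d, n_min, n_max)) for d in docs]
    doc_freq = Counter()
    for c in counts:
        doc_freq.update(c.keys())  # each n-gram counted once per document
    n_docs = len(docs)
    vectors = []
    for c in counts:
        total = sum(c.values()) or 1  # guard against empty documents
        vectors.append({
            gram: (tf / total)
            * (math.log((1 + n_docs) / (1 + doc_freq[gram])) + 1)
            for gram, tf in c.items()
        })
    return vectors


if __name__ == "__main__":
    # Toy placeholder strings, not taken from the dataset.
    samples = ["বাংলা খবর", "বাংলা ভাষা"]
    for vec in tfidf_vectors(samples):
        top = sorted(vec.items(), key=lambda kv: kv[1], reverse=True)[:5]
        print(top)
```

The resulting sparse dictionaries could be passed to any linear classifier. Character n-grams are used here only because they need no tokenizer, which can be convenient for a low-resource language.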

research
11/19/2019

In Search of Credible News

We study the problem of finding fake online news. This is an important p...
research
02/23/2021

Factorization of Fact-Checks for Low Resource Indian Languages

The advancement in technology and accessibility of internet to each indi...
research
10/21/2019

Localization of Fake News Detection via Multitask Transfer Learning

The use of the internet as a fast medium of spreading fake news reinforc...
research
08/26/2022

Cross-lingual Transfer Learning for Fake News Detector in a Low-Resource Language

Development of methods to detect fake news (FN) in low-resource language...
research
03/22/2022

Approaches for Improving the Performance of Fake News Detection in Bangla: Imbalance Handling and Model Stacking

Imbalanced datasets can lead to biasedness into the detection of fake ne...
research
10/14/2020

No Rumours Please! A Multi-Indic-Lingual Approach for COVID Fake-Tweet Detection

The sudden widespread menace created by the present global pandemic COVI...
research
09/16/2023

RMDM: A Multilabel Fakenews Dataset for Vietnamese Evidence Verification

In this study, we present a novel and challenging multilabel Vietnamese ...
