
Two-Stage Classifier for COVID-19 Misinformation Detection Using BERT: a Study on Indonesian Tweets

06/30/2022
by Douglas Raevan Faisal, et al.

The COVID-19 pandemic has had significant global impacts since the beginning of 2020. It brought considerable confusion to society, especially through the spread of misinformation on social media. Although several studies have addressed misinformation detection in social media data, most have focused on English datasets. Research on COVID-19 misinformation detection in Indonesia is still scarce. Therefore, in this research, we collect and annotate datasets for Indonesian and build prediction models for detecting COVID-19 misinformation that take the tweet's relevance into account. The dataset was constructed by a team of annotators who labeled the relevance and misinformation of the tweet data. In this study, we propose a two-stage classifier model using the IndoBERT pre-trained language model for the tweet misinformation detection task. We also experiment with several other baseline models for text classification. The experimental results show that the combination of the BERT sequence classifier for relevance prediction and Bi-LSTM for misinformation detection outperformed other machine learning models, with an accuracy of 87.02%. [...] contributes to the higher performance of most prediction models. We release a high-quality COVID-19 misinformation tweet corpus in the Indonesian language, as indicated by the high inter-annotator agreement.
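
The control flow of a two-stage classifier can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes that stage one filters tweets by relevance and that stage two classifies only the relevant ones, which is one possible reading of the abstract. The function names, the label strings, and the keyword rules are all hypothetical stand-ins; in the paper, the stages are an IndoBERT sequence classifier and a Bi-LSTM.

```python
from typing import Callable, List

# A stage is any function mapping tweet text to a yes/no decision.
Stage = Callable[[str], bool]


def two_stage_predict(
    tweets: List[str],
    is_relevant: Stage,
    is_misinformation: Stage,
) -> List[str]:
    """Label each tweet using a relevance stage, then a misinformation stage.

    Assumption: tweets rejected by stage one never reach stage two.
    The label strings are hypothetical, not taken from the paper.
    """
    labels = []
    for text in tweets:
        if not is_relevant(text):
            labels.append("irrelevant")
        elif is_misinformation(text):
            labels.append("misinformation")
        else:
            labels.append("not_misinformation")
    return labels


# Toy keyword rules standing in for the trained models (illustration only).
def toy_is_relevant(text: str) -> bool:
    return "covid" in text.lower()


def toy_is_misinformation(text: str) -> bool:
    return "chip" in text.lower()


if __name__ == "__main__":
    sample = [
        "Vaksin covid berisi chip",
        "Kasus covid menurun",
        "Harga cabai naik",
    ]
    for text, label in zip(
        sample, two_stage_predict(sample, toy_is_relevant, toy_is_misinformation)
    ):
        print(f"{label}: {text}")
```

Passing the two stages as plain callables keeps the sketch independent of any model library, so either stage could be swapped for a trained classifier without changing the surrounding logic.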



Code Repositories

covid19-indonesian-misinformation-tweets
