Detecting Toxicity in News Articles: Application to Bulgarian

08/26/2019
by Yoan Dinkov, et al.

Online media aim to reach ever bigger audiences and to hold their attention ever longer. This competition creates an environment that rewards sensational, fake, and toxic news. To help limit the spread and impact of such news, we propose and develop a news toxicity detector that can recognize various types of toxic content. While previous research has focused primarily on English, here we target Bulgarian. We created a new dataset by crawling a website that has been collecting Bulgarian news articles for five years and manually categorizing them into eight toxicity groups. We then trained a multi-class classifier with nine categories: eight toxic and one non-toxic. We experimented with different representations based on ELMo, BERT, and XLM, as well as with a variety of domain-specific features. Due to the small size of our dataset, we created a separate model for each feature type, and we ultimately combined these models into a meta-classifier. The evaluation results show an accuracy of 59.0%, well above the majority-class baseline (Acc=30.3%).
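
To make the baseline comparison concrete, here is a minimal sketch of how a majority-class baseline and its accuracy can be computed. It is not the authors' code: the function names and the toy labels ("non-toxic", "toxic-a", "toxic-b") are hypothetical and stand in for the paper's nine categories.

```python
from collections import Counter


def majority_class_baseline(train_labels):
    """Return the most frequent label in the training data."""
    if not train_labels:
        raise ValueError("train_labels must not be empty")
    return Counter(train_labels).most_common(1)[0][0]


def accuracy(gold, predicted):
    """Fraction of positions where the predicted label equals the gold label."""
    if not gold or len(gold) != len(predicted):
        raise ValueError("gold and predicted must be non-empty and equally long")
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


# Toy labels, invented for illustration only (not taken from the dataset).
train_labels = ["non-toxic"] * 3 + ["toxic-a"] * 2 + ["toxic-b"]
test_labels = ["non-toxic", "toxic-a", "non-toxic", "toxic-b"]

# The baseline predicts the majority training label for every test article.
majority = majority_class_baseline(train_labels)
baseline_predictions = [majority] * len(test_labels)
baseline_accuracy = accuracy(test_labels, baseline_predictions)

print(majority, baseline_accuracy)
```

Because the baseline always predicts one label, its accuracy equals the share of test articles that carry that label, so any trained classifier should be judged against it rather than against zero.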

research
07/13/2023

Tackling Fake News in Bengali: Unraveling the Impact of Summarization vs. Augmentation on Pre-trained Language Models

With the rise of social media and online news sources, fake news has bec...
research
10/05/2019

A Machine Learning Analysis of the Features in Deceptive and Credible News

Fake news is a type of pervasive propaganda that spreads misinformation ...
research
01/02/2017

Stance detection in online discussions

This paper describes our system created to detect stance in online discu...
research
05/05/2022

RaFoLa: A Rationale-Annotated Corpus for Detecting Indicators of Forced Labour

Forced labour is the most common type of modern slavery, and it is incre...
research
04/03/2017

Combining Lexical and Syntactic Features for Detecting Content-dense Texts in News

Content-dense news report important factual information about an event i...
research
08/27/2018

Models for Predicting Community-Specific Interest in News Articles

In this work, we ask two questions: 1. Can we predict the type of commun...
research
06/11/2018

How Curiosity can be modeled for a Clickbait Detector

The impact of continually evolving digital technologies and the prolifer...
