SaRoCo: Detecting Satire in a Novel Romanian Corpus of News Articles

05/13/2021
by   Ana-Cristina Rogoz, et al.

In this work, we introduce a corpus for satire detection in Romanian news. We gathered 55,608 public news articles from multiple real and satirical news sources, composing one of the largest corpora for satire detection regardless of language and the only one for the Romanian language. We provide an official split of the text samples, such that training news articles belong to different sources than test news articles, thus ensuring that models do not achieve high performance simply due to overfitting. We conduct experiments with two state-of-the-art deep neural models, resulting in a set of strong baselines for our novel corpus. Our results show that the machine-level accuracy for satire detection in Romanian is quite low (under 73% on the test set) compared to the human-level accuracy (87%), leaving enough room for improvement in future research.
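The source-disjoint split described above can be illustrated with a minimal sketch. This is not the authors' procedure or the official SaRoCo split; the field names (`source`, `label`, `text`), the source identifiers, and the function `split_by_source` are hypothetical and chosen only to show the idea that no news source contributes articles to both the training and the test set.

```python
def split_by_source(articles, test_sources):
    """Partition articles so that no source appears in both subsets.

    `articles` is a list of dicts with a hypothetical "source" key;
    `test_sources` is the set of source names held out for testing.
    """
    train, test = [], []
    for article in articles:
        if article["source"] in test_sources:
            test.append(article)
        else:
            train.append(article)
    return train, test


# Toy data with made-up source names (assumption, not from the corpus).
articles = [
    {"source": "real_a", "label": "regular", "text": "..."},
    {"source": "real_a", "label": "regular", "text": "..."},
    {"source": "real_b", "label": "regular", "text": "..."},
    {"source": "satire_a", "label": "satirical", "text": "..."},
    {"source": "satire_b", "label": "satirical", "text": "..."},
]

train, test = split_by_source(articles, test_sources={"real_b", "satire_b"})

# Sanity check: the two subsets share no source.
train_sources = {a["source"] for a in train}
held_out_sources = {a["source"] for a in test}
assert train_sources.isdisjoint(held_out_sources)
```

Holding out whole sources, rather than sampling articles at random, keeps a classifier from scoring well merely by recognizing the writing style of a particular publication.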
