BERT-based Authorship Attribution on the Romanian Dataset called ROST

01/29/2023
by Sanda Maria Avram, et al.

Although it has been studied for decades, the problem of authorship attribution remains an active research topic. Among the more recent tools applied to it are pre-trained language models, the most prevalent being BERT. Here we use such a model to detect the authorship of texts written in Romanian. The dataset is highly unbalanced: it shows significant differences in the number of texts per author, the sources from which the texts were collected, the time periods in which the authors lived and wrote, the medium for which the texts were intended (paper or online), and the type of writing (stories, short stories, fairy tales, novels, literary articles, and sketches). The results are better than expected, sometimes exceeding 87% macro-accuracy.

research
11/10/2022

BERT in Plutarch's Shadows

The extensive surviving corpus of the ancient scholar Plutarch of Chaero...
research
12/26/2018

An Investigation of Supervised Learning Methods for Authorship Attribution in Short Hinglish Texts using Char & Word N-grams

The writing style of a person can be affirmed as a unique identity indic...
research
11/09/2022

A comparison of several AI techniques for authorship attribution on Romanian texts

Determining the author of a text is a difficult task. Here we compare mu...
research
09/14/2022

On the State of the Art in Authorship Attribution and Authorship Verification

Despite decades of research on authorship attribution (AA) and authorshi...
research
05/10/2023

Enriching language models with graph-based context information to better understand textual data

A considerable number of texts encountered daily are somehow connected w...
research
07/19/2022

Can You Fool AI by Doing a 180? – A Case Study on Authorship Analysis of Texts by Arata Osada

This paper is our attempt at answering a twofold question covering the a...
