Authorship Attribution in Bangla literature using Character-level CNN

01/11/2020
by Aisha Khatun, et al.

Characters are the smallest unit of text from which stylometric signals can be extracted to determine the author of a text. In this paper, we investigate the effectiveness of character-level signals in Authorship Attribution of Bangla Literature and show that the results are promising but improvable. The time and memory efficiency of the proposed model is much higher than that of its word-level counterparts, but its accuracy is 2-5% lower than the best-performing word-level models. A comparison with various word-based models shows that the proposed model performs increasingly better with larger datasets. We also analyze the effect of pre-training character embeddings on the diverse Bangla character set for authorship attribution, and find that pre-training improves performance by up to 10%. We use two datasets with 6 to 14 authors, balancing them before training, and compare the results.
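To make the approach concrete, the sketch below shows a minimal character-level CNN classifier of the general kind the abstract describes. It is an illustrative sketch only, not the architecture reported in the paper: the class name CharCNNClassifier, the embedding size, filter counts, kernel sizes, sequence length, and author count are all assumptions, and the embedding layer is simply where a pre-trained Bangla character embedding matrix could be loaded.

```python
# Minimal sketch of a character-level CNN for authorship attribution.
# Hyperparameters and names here are illustrative assumptions, not the
# configuration reported in the paper.
import torch
import torch.nn as nn

class CharCNNClassifier(nn.Module):
    def __init__(self, vocab_size, num_authors, embed_dim=64,
                 num_filters=128, kernel_sizes=(3, 5, 7)):
        super().__init__()
        # Character embedding; its weights could be initialized from a
        # pre-trained Bangla character embedding matrix.
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        # Parallel 1-D convolutions over the character sequence.
        self.convs = nn.ModuleList(
            nn.Conv1d(embed_dim, num_filters, k) for k in kernel_sizes
        )
        self.dropout = nn.Dropout(0.5)
        self.fc = nn.Linear(num_filters * len(kernel_sizes), num_authors)

    def forward(self, char_ids):
        # char_ids: (batch, seq_len) integer character indices
        x = self.embedding(char_ids).transpose(1, 2)  # (batch, embed_dim, seq_len)
        # Convolve, then max-pool each feature map to a fixed-length vector.
        pooled = [torch.relu(conv(x)).max(dim=2).values for conv in self.convs]
        features = self.dropout(torch.cat(pooled, dim=1))
        return self.fc(features)  # (batch, num_authors) author logits

# Example: two texts, each truncated/padded to 512 characters,
# classified among 6 hypothetical authors.
model = CharCNNClassifier(vocab_size=100, num_authors=6)
dummy_batch = torch.randint(1, 100, (2, 512))
print(model(dummy_batch).shape)  # torch.Size([2, 6])
```

Max-pooling over each convolution's feature maps yields a fixed-length representation regardless of text length, which is one reason character-level models tend to be lighter on memory than models built over large word vocabularies.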


