Inference of Fine-grained Attributes of Bengali Corpus for Stylometry Detection

10/13/2012
by Tanmoy Chakraborty, et al.

Stylometry, the science of inferring characteristics of an author from the characteristics of documents written by that author, is a problem with a long history. It belongs to the core task of text categorization, with applications in authorship identification, plagiarism detection, forensic investigation, computer security, and copyright and estate disputes. In this work, we present a strategy for stylometry detection in documents written in Bengali. We combine a set of fine-grained attribute features with a set of lexical markers to analyze the text, and use three semi-supervised measures to make decisions. Finally, a majority voting approach is used for the final classification. The system is fully automatic and language-independent. Evaluation of our approach on Bengali authors shows reasonably promising accuracy in comparison to the baseline model.
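The final voting step can be illustrated with a minimal sketch. The abstract does not name the three measures or say how ties are resolved, so the function name `majority_vote`, the author labels, and the tie-handling `fallback` argument below are all assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter

def majority_vote(predictions, fallback=None):
    """Return the label predicted by the most measures.

    `predictions` holds one author label per measure (hypothetical input).
    If there is no input, or the top two labels tie, `fallback` is returned;
    this tie rule is an assumption, since the abstract does not specify one.
    """
    if not predictions:
        return fallback
    ranked = Counter(predictions).most_common()
    # A tie between the two most frequent labels means there is no majority.
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return fallback
    return ranked[0][0]

# Hypothetical outputs of three measures for one document.
votes = ["author_A", "author_B", "author_A"]
decision = majority_vote(votes, fallback="undecided")
```

With three measures, a tie can only occur when all three disagree, in which case this sketch returns the fallback value rather than picking an author arbitrarily.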


