Efficient Social Network Multilingual Classification using Character, POS n-grams and Dynamic Normalization

In this paper we describe a dynamic normalization process applied to multilingual social network documents (Facebook and Twitter) to improve the performance of the author profiling task on short texts. After normalization, character n-grams and POS-tag n-grams are extracted to capture the stylistic information encoded in the documents (emoticons, character flooding, capital letters, references to other users, hyperlinks, hashtags, etc.). Experiments with SVM showed up to 90% performance.
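To make the pipeline in the abstract concrete, here is a minimal Python sketch of one possible normalization step followed by character n-gram extraction. It is an illustration under stated assumptions, not the authors' implementation: the placeholder tokens (`<url>`, `<user>`, `<tag>`), the regular expressions, the rule of collapsing flooded characters to two repetitions, and the function names `normalize` and `char_ngrams` are all hypothetical. POS tagging and the SVM classifier are left out.

```python
import re
from collections import Counter

# Hypothetical patterns and placeholder tokens; the paper's actual rules may differ.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")
FLOOD_RE = re.compile(r"(.)\1{2,}")  # any character repeated three or more times


def normalize(text):
    """Replace volatile tokens with fixed placeholders and shorten character flooding."""
    text = URL_RE.sub("<url>", text)        # hyperlinks first, since URLs may contain '#' or '@'
    text = MENTION_RE.sub("<user>", text)   # references to other users
    text = HASHTAG_RE.sub("<tag>", text)    # hashtags
    text = FLOOD_RE.sub(r"\1\1", text)      # keep two repetitions as a trace of flooding
    return text                             # capital letters are left untouched on purpose


def char_ngrams(text, n=3):
    """Count overlapping character n-grams of length n."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


if __name__ == "__main__":
    message = "Sooooo cool!!! @bob see http://t.co/x #fun"
    clean = normalize(message)
    print(clean)
    print(char_ngrams(clean, n=3).most_common(5))
```

Mapping links, mentions, and hashtags to fixed tokens keeps the fact that they occur while discarding their specific content, so the resulting n-grams reflect writing style rather than particular users or URLs. The counts returned by `char_ngrams` could then serve as features for a classifier such as an SVM.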

research
03/31/2018

Violence originated from Facebook: A case study in Bangladesh

Facebook as in social network is a great innovation of modern times. Amo...
research
06/23/2018

Temporal Activity Path Based Character Correction in Social Networks

Vast amount of multimedia data contains massive and multifarious social ...
research
08/28/2013

Text recognition in both ancient and cartographic documents

This paper deals with the recognition and matching of text in both carto...
research
10/25/2016

Improving historical spelling normalization with bi-directional LSTMs and multi-task learning

Natural-language processing of historical documents is complicated by th...
research
11/13/2021

SocialBERT – Transformers for Online SocialNetwork Language Modelling

The ubiquity of the contemporary language understanding tasks gives rele...
research
07/08/2018

Social network aided plagiarism detection: Social network aided plagiarism detection

The prevalence of different kinds of electronic devices and the volume o...
research
12/19/2021

LUC at ComMA-2021 Shared Task: Multilingual Gender Biased and Communal Language Identification without using linguistic features

This work aims to evaluate the ability that both probabilistic and state...
