A Comparative Analysis of Distributional Term Representations for Author Profiling in Social Media

Author Profiling (AP) aims at predicting specific characteristics of a group of authors by analyzing their written documents. Much research has focused on determining suitable features for modeling authors' writing patterns. Reported results indicate that content-based features continue to be the most relevant and discriminative features for solving this task. Thus, in this paper, we present a thorough analysis of the appropriateness of different distributional term representations (DTRs) for the AP task. In this regard, we introduce a novel framework for supervised AP using these representations and, building on it, carry out a comparative analysis of representations such as DOR, TCOR, SSR, and word2vec on the AP problem. We also compare the performance of the DTRs against classic approaches, including popular topic-based methods. The obtained results indicate that DTRs are suitable for solving the AP task in social media domains, as they achieve competitive results while providing meaningful interpretability.
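To give a rough sense of what a distributional term representation looks like, the following minimal sketch assumes that DOR denotes a document occurrence representation, in which each term is described by the documents it occurs in. The function names, the toy corpus, the use of raw counts, and the averaging step are illustrative assumptions and are not taken from the paper's framework, which may weight and aggregate terms differently.

```python
from collections import Counter


def dor_term_vectors(docs):
    """Map each term to its raw counts across documents (simplified DOR).

    Assumption: whitespace tokenization and unweighted counts; the
    original work may apply a different weighting scheme.
    """
    counts = [Counter(doc.lower().split()) for doc in docs]
    vocab = sorted(set().union(*counts))
    return {term: [c[term] for c in counts] for term in vocab}


def represent(text, term_vectors, n_docs):
    """Represent a text as the average of the vectors of its known terms."""
    vecs = [term_vectors[t] for t in text.lower().split() if t in term_vectors]
    if not vecs:
        # No known terms: fall back to an all-zero vector.
        return [0.0] * n_docs
    return [sum(col) / len(vecs) for col in zip(*vecs)]


# Hypothetical toy corpus standing in for social media posts.
docs = ["love my new phone", "my homework is due", "new phone new case"]
tv = dor_term_vectors(docs)
profile = represent("new phone", tv, len(docs))
```

Each dimension of `profile` corresponds to one training document, which hints at why such representations can be inspected directly: a high value points to a document whose vocabulary the new text shares. A vector like this could then be passed to any standard supervised classifier.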

