On the Stability of Online Language Features: How Much Text do you Need to know a Person?

04/24/2015
by   Eben M. Haber, et al.

In recent years, numerous studies have inferred personality and other traits from people's online writing. While these studies are encouraging, more information is needed in order to use these techniques with confidence. How do linguistic features vary across different online media, and how much text is required to have a representative sample for a person? In this paper, we examine several large sets of online, user-generated text, drawn from Twitter, email, blogs, and online discussion forums. We examine and compare population-wide results for the linguistic measure LIWC, and the inferred traits of Big5 Personality and Basic Human Values. We also empirically measure the stability of these traits across different-sized samples for each individual. Our results highlight the importance of tuning models to each online medium, and include guidelines for the minimum amount of text required for a representative result.
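
As a rough illustration of what measuring stability across sample sizes could look like, the sketch below compares a word-category rate computed on random subsamples of a person's words with the rate computed on all of their text. It is only a sketch under stated assumptions: the `FIRST_PERSON` word set is a hypothetical stand-in for a LIWC-style category (not the actual LIWC dictionary), random word sampling is an assumed procedure, and the function names are illustrative rather than taken from the paper.

```python
import random

# Hypothetical mini-category standing in for a LIWC-style word category.
FIRST_PERSON = {"i", "me", "my", "mine", "myself"}


def category_rate(words, category):
    """Fraction of words that belong to the category (0.0 for empty input)."""
    if not words:
        return 0.0
    return sum(w.lower() in category for w in words) / len(words)


def stability_by_sample_size(words, category, sizes, trials=200, seed=0):
    """Mean absolute gap between subsample rate and full-text rate, per size.

    Assumption: subsamples are drawn uniformly at random without replacement;
    the paper's own sampling procedure may differ.
    """
    rng = random.Random(seed)
    full_rate = category_rate(words, category)
    result = {}
    for n in sizes:
        n = min(n, len(words))  # cannot sample more words than exist
        gaps = [
            abs(category_rate(rng.sample(words, n), category) - full_rate)
            for _ in range(trials)
        ]
        result[n] = sum(gaps) / trials
    return result
```

A smaller mean gap at a given sample size suggests that the feature computed from that much text is closer to the value obtained from the person's full text; plotting the gap against sample size is one way to look for a point of diminishing returns.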

research
06/26/2019

A Corpus for Modeling User and Language Effects in Argumentation on Online Debating

Existing argumentation datasets have succeeded in allowing researchers t...
research
10/06/2018

Personality facets recognition from text

Fundamental Big Five personality traits and their facets are known to co...
research
01/25/2022

A Quantitative and Qualitative Analysis of Schizophrenia Language

Schizophrenia is one of the most disabling mental health conditions to l...
research
12/14/2022

Relationship Between Online Harmful Behaviors and Social Network Message Writing Style

In this paper, we explore the relationship between an individual's writi...
research
11/15/2017

Tracking Typological Traits of Uralic Languages in Distributed Language Representations

Although linguistic typology has a long history, computational approache...
research
10/14/2016

A Language-independent and Compositional Model for Personality Trait Recognition from Short Texts

Many methods have been used to recognize author personality traits from ...
research
03/17/2021

Inferred vs traditional personality assessment: are we predicting the same thing?

Machine learning methods are widely used by researchers to predict psych...
