Predicting gender and age categories in English conversations using lexical, non-lexical, and turn-taking features

by Andreas Liesenfeld, et al.

This paper examines gender and age salience and (stereo)typicality in British English talk, with the aim of predicting gender and age categories from lexical, phrasal and turn-taking features. We examine the SpokenBNC, a corpus of around 11.4 million words of British English conversations, and identify behavioural differences between speakers who are labelled for gender and age categories. We explore differences in language use and turn-taking dynamics and identify a range of characteristics that set the categories apart. We find that female speakers tend to produce more and slightly longer turns, while turns by male speakers feature a higher type-token ratio and a distinct range of minimal particles such as "eh", "uh" and "em". Across age groups, we observe, for instance, that swear words and laughter characterize young speakers' talk, while old speakers tend to produce more truncated words. We then use the observed characteristics to predict the gender and age labels of speakers, per conversation and per turn, as a classification task. The results show that non-lexical utterances such as minimal particles, which are usually left out of dialog data, can help set the categories apart.
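
As a rough illustration of the turn-level measures named in the abstract (number of turns, turn length, and type-token ratio), the following is a minimal sketch, not the authors' pipeline. The function name `turn_features`, the `(label, utterance)` input format, and whitespace tokenization are all assumptions made for the example; the paper's actual preprocessing of the SpokenBNC is not described here.

```python
from collections import defaultdict


def turn_features(turns):
    """Compute simple per-label turn statistics.

    `turns` is a list of (speaker_label, utterance) pairs in conversation
    order. Both the input format and the whitespace tokenization are
    assumptions for this sketch, not the paper's method.
    """
    tokens_by_label = defaultdict(list)
    turn_counts = defaultdict(int)
    for label, utterance in turns:
        tokens = utterance.lower().split()  # naive tokenizer (assumption)
        tokens_by_label[label].extend(tokens)
        turn_counts[label] += 1

    features = {}
    for label, tokens in tokens_by_label.items():
        n_turns = turn_counts[label]
        features[label] = {
            "turns": n_turns,
            # average number of tokens per turn
            "mean_turn_length": len(tokens) / n_turns,
            # distinct word forms divided by total tokens
            "type_token_ratio": len(set(tokens)) / len(tokens) if tokens else 0.0,
        }
    return features


# Toy conversation with hypothetical labels "F" and "M".
example = [
    ("F", "yeah I know I know"),
    ("M", "eh right"),
    ("F", "mm yeah"),
]
stats = turn_features(example)
print(stats["F"])
print(stats["M"])
```

Note that the type-token ratio depends on how much text each group contributes, so comparing groups fairly would likely require equal-sized samples or a length-normalized variant.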

