Demographic Dialectal Variation in Social Media: A Case Study of African-American English

08/31/2016
by Su Lin Blodgett, et al.

Though dialectal language is increasingly abundant on social media, few resources exist for developing NLP tools to handle such language. We conduct a case study of dialectal language in online conversational text by investigating African-American English (AAE) on Twitter. We propose a distantly supervised model to identify AAE-like language from demographics associated with geo-located messages, and we verify that this language follows well-known AAE linguistic phenomena. In addition, we analyze the quality of existing language identification and dependency parsing tools on AAE-like text, demonstrating that they perform poorly on such text compared to text associated with white speakers. We also provide an ensemble classifier for language identification which eliminates this disparity and release a new corpus of tweets containing AAE-like language.
