Who Is Missing? Characterizing the Participation of Different Demographic Groups in a Korean Nationwide Daily Conversation Corpus

04/20/2022
by Haewoon Kwak, et al.

A conversation corpus is essential for building interactive AI applications. However, the demographic makeup of the participants in such corpora remains largely unexplored, mainly because many corpora lack individual-level data. In this work, we analyze a Korean nationwide daily conversation corpus constructed by the National Institute of Korean Language (NIKL) to characterize the participation of different demographic (age and sex) groups in the corpus.
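
As a rough illustration of what characterizing participation by demographic group can look like, the following minimal sketch computes each (age group, sex) cell's share of all participants. The records, the group labels, and the function name `participation_shares` are hypothetical and chosen only for this example; they are not taken from the NIKL corpus or from the authors' method.

```python
from collections import Counter

# Hypothetical participant records as (age group, sex) pairs.
# These are illustrative values, not data from the NIKL corpus.
participants = [
    ("20s", "F"), ("20s", "F"), ("20s", "M"),
    ("30s", "F"), ("50s", "M"),
]


def participation_shares(records):
    """Return each (age group, sex) cell's share of all participants."""
    counts = Counter(records)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


shares = participation_shares(participants)

# Print one line per demographic cell, sorted by age group and then sex.
for (age, sex), share in sorted(shares.items()):
    print(f"{age} {sex}: {share:.0%}")
```

Comparing such shares against a reference population (for example, census proportions) would be one way to see which groups are under-represented; that comparison is left out here because it depends on data not given in the abstract.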

research
02/24/2020

Multilingual Twitter Corpus and Baselines for Evaluating Demographic Bias in Hate Speech Recognition

Existing research on fairness evaluation of document classification mode...
research
08/29/2022

Evolving Label Usage within Generation Z when Self-Describing Sexual Orientation

Evaluating change in ranked term importance in a growing corpus is a pow...
research
08/05/2020

Designing the Business Conversation Corpus

While the progress of machine translation of written text has come far i...
research
04/19/2017

A Large Self-Annotated Corpus for Sarcasm

We introduce the Self-Annotated Reddit Corpus (SARC), a large corpus for...
research
05/19/2022

On Demographic Bias in Fingerprint Recognition

Fingerprint recognition systems have been deployed globally in numerous ...
research
09/25/2019

Developing a Fine-Grained Corpus for a Less-resourced Language: the case of Kurdish

Kurdish is a less-resourced language consisting of different dialects wr...
research
08/31/2020

Vyaktitv: A Multimodal Peer-to-Peer Hindi Conversations based Dataset for Personality Assessment

Automatically detecting personality traits can aid several applications,...
