Controlled Analyses of Social Biases in Wikipedia Bios

12/31/2020
by Anjalie Field, et al.

Social biases on Wikipedia, a widely read global platform, could greatly influence public opinion. While prior research has examined man/woman gender bias in biography articles, possible influences of confounding variables limit the conclusions that can be drawn. In this work, we present a methodology for reducing the effects of confounding variables in analyses of Wikipedia biography pages. Given a target corpus for analysis (e.g., biography pages about women), we present a method for constructing a comparison corpus that matches the target corpus in as many attributes as possible, except the target attribute (e.g., the gender of the subject). We evaluate our methodology by developing metrics to measure how well the comparison corpus aligns with the target corpus. We then examine how articles about gender and racial minorities (cisgender women, non-binary people, transgender women, and transgender men; African American, Asian American, and Hispanic/Latinx American people) differ from other articles, including analyses driven by social theories like intersectionality. In addition to identifying suspect social biases, our results show that failing to control for confounding variables can lead to different conclusions and can mask biases. Our contributions include a methodology that facilitates further analyses of bias in Wikipedia articles, findings that can aid Wikipedia editors in reducing biases, and a framework and evaluation metrics to guide future work in this area.
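
To make the idea of a matched comparison corpus concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not say how articles are represented or matched, so the sketch assumes each article is a set of attribute labels (with the target attribute, such as gender, already excluded) and uses greedy Jaccard-similarity matching as a stand-in technique. The function names `jaccard`, `build_comparison_corpus`, and `mean_match_similarity` are hypothetical, as is the alignment metric.

```python
from typing import Dict, Optional, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap between two attribute sets: 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_comparison_corpus(
    target: Dict[str, Set[str]],
    candidates: Dict[str, Set[str]],
) -> Dict[str, Optional[str]]:
    """Greedily pair each target article with its most similar unused candidate.

    Assumption: attribute sets exclude the target attribute (e.g., gender),
    so matched pairs resemble each other in everything except that attribute.
    """
    unused = dict(candidates)  # copy so the caller's dict is not modified
    matches: Dict[str, Optional[str]] = {}
    for title, attrs in target.items():
        if not unused:
            matches[title] = None  # no candidates left to pair with
            continue
        best = max(unused, key=lambda c: jaccard(attrs, unused[c]))
        matches[title] = best
        del unused[best]  # match without replacement
    return matches


def mean_match_similarity(
    target: Dict[str, Set[str]],
    candidates: Dict[str, Set[str]],
    matches: Dict[str, Optional[str]],
) -> float:
    """Hypothetical alignment metric: average similarity over matched pairs."""
    scores = [
        jaccard(target[t], candidates[c])
        for t, c in matches.items()
        if c is not None
    ]
    return sum(scores) / len(scores) if scores else 0.0
```

A greedy pass is order-dependent and need not find the best overall pairing; it is used here only because it keeps the sketch short. The mean similarity gives a rough sense of how well the comparison corpus aligns with the target corpus, in the spirit of the evaluation metrics the abstract mentions.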


