She Elicits Requirements and He Tests: Software Engineering Gender Bias in Large Language Models

03/17/2023
by Christoph Treude, et al.

Implicit gender bias in software development is a well-documented issue, such as the association of technical roles with men. To address this bias, it is important to understand it in more detail. This study uses data mining techniques to investigate the extent to which 56 tasks related to software development, such as assigning GitHub issues and testing, are affected by implicit gender bias embedded in large language models. We systematically translated each task from English into a genderless language and back, and investigated the pronouns associated with each task. Based on translating each task 100 times in different permutations, we identify a significant disparity in the gendered pronoun associations with different tasks. Specifically, requirements elicitation was associated with the pronoun "he" in only 6% of cases, while testing was associated with "he" in 100% of cases. Additionally, tasks related to helping others had a 91% association with "he", while the same association for tasks related to asking coworkers was only 52%. These findings reveal a clear pattern of gender bias related to software development tasks and have important implications for addressing this issue both in the training of large language models and in broader society.
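The pronoun-counting step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: it assumes the round-trip translations have already been collected as English sentences, and the helper names `leading_pronoun` and `pronoun_share`, as well as the sample sentences, are hypothetical.

```python
import re
from collections import Counter

# Match standalone pronouns only; the word boundaries keep "the" or "they're"
# from being mistaken for "he".
PRONOUN_RE = re.compile(r"\b(he|she|they|it)\b", re.IGNORECASE)


def leading_pronoun(sentence):
    """Return the first pronoun found in the sentence (lowercased), or None."""
    match = PRONOUN_RE.search(sentence)
    return match.group(1).lower() if match else None


def pronoun_share(back_translations, pronoun="he"):
    """Fraction of back-translated sentences whose first pronoun is `pronoun`.

    Sentences without any pronoun still count toward the denominator
    (an assumption of this sketch).
    """
    counts = Counter(leading_pronoun(s) for s in back_translations)
    total = sum(counts.values())
    return counts[pronoun] / total if total else 0.0


# Made-up back-translations, for illustration only (not data from the study).
sample = [
    "He tests the software.",
    "She tests the software.",
    "He tests the software.",
    "He is testing the software.",
]
share = pronoun_share(sample, "he")
print(f"{share:.2f}")
```

In a setup like the one the abstract describes, `sample` would hold the 100 back-translations of a single task, and the share would be computed per task and per pronoun.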

research
09/17/2023

Public Perceptions of Gender Bias in Large Language Models: Cases of ChatGPT and Ernie

Large language models are quickly gaining momentum, yet are found to dem...
research
07/18/2023

Unveiling Gender Bias in Terms of Profession Across LLMs: Analyzing and Addressing Sociological Implications

Gender bias in artificial intelligence (AI) and natural language process...
research
07/18/2022

Selection Bias Induced Spurious Correlations in Large Language Models

In this work we show how large language models (LLMs) can learn statisti...
research
09/08/2022

Relationship between Gender and Code Reading Speed in Software Development

Recently, workforce shortage has become a popular issue in information t...
research
08/27/2021

Harms of Gender Exclusivity and Challenges in Non-Binary Representation in Language Technologies

Gender is widely discussed in the context of language tasks and when exa...
research
06/06/2023

MISGENDERED: Limits of Large Language Models in Understanding Pronouns

Content Warning: This paper contains examples of misgendering and erasur...
research
09/20/2023

Data-Driven Analysis of Gender Fairness in the Software Engineering Academic Landscape

Gender bias in education gained considerable relevance in the literature...
