When do Word Embeddings Accurately Reflect Surveys on our Beliefs About People?

04/25/2020
by Kenneth Joseph, et al.
Social biases are encoded in word embeddings. This presents a unique opportunity to study society historically and at scale, and a unique danger when embeddings are used in downstream applications. Here, we investigate the extent to which publicly available word embeddings accurately reflect beliefs about certain kinds of people as measured via traditional survey methods. We find that biases in word embeddings do, on average, closely mirror survey data across seventeen dimensions of social meaning. However, biases in embeddings reflect survey data much more closely for some dimensions of meaning (e.g., gender) than for others (e.g., race), and we can be highly confident that embedding-based measures reflect survey data only for the most salient biases.


