Uncovering and Categorizing Social Biases in Text-to-SQL

05/25/2023
by Yan Liu, et al.

Content Warning: This work contains examples that potentially implicate stereotypes, associations, and other harms that could be offensive to individuals in certain social groups.

Large pre-trained language models are acknowledged to carry social biases towards different demographics, which can amplify existing stereotypes in society and cause further harm. Text-to-SQL is an important task whose models are mainly adopted in administrative industries, where unfair decisions may lead to catastrophic consequences. However, existing Text-to-SQL models are trained on clean, neutral datasets such as Spider and WikiSQL. This, to some extent, covers up social biases in models under ideal conditions, yet those biases may still emerge in real application scenarios. In this work, we aim to uncover and categorize social biases in Text-to-SQL models. We summarize the categories of social biases that may occur in structured data for Text-to-SQL models. We build test benchmarks and reveal that models with similar task accuracy can exhibit social biases at very different rates. We show how our methodology can be used to uncover and assess social biases in the downstream Text-to-SQL task. We will release our code and data.


