
Ethical-Advice Taker: Do Language Models Understand Natural Language Interventions?

06/02/2021
by Jieyu Zhao, et al.

Is it possible to use natural language to intervene in a model's behavior and alter its prediction in a desired way? We investigate the effectiveness of natural language interventions for reading-comprehension systems, studying this in the context of social stereotypes. Specifically, we propose a new language understanding task, Linguistic Ethical Interventions (LEI), where the goal is to amend a question-answering (QA) model's unethical behavior by communicating context-specific principles of ethics and equity to it. To this end, we build upon recent methods for quantifying a system's social stereotypes, augmenting them with different kinds of ethical interventions and the desired model behavior under such interventions. Our zero-shot evaluation finds that even today's powerful neural language models are extremely poor ethical-advice takers, that is, they respond surprisingly little to ethical interventions even though these interventions are stated as simple sentences. Few-shot learning improves model behavior but remains far from the desired outcome, especially when evaluated for various types of generalization. Our new task thus poses a novel language understanding challenge for the community.
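As a rough illustration of the task setup described above, the sketch below shows one way an intervention sentence could be inserted into a QA input and its effect measured. Everything in it is an assumption made for illustration: the abstract does not specify the input format or the metric, so the field names, the "score gap between two subjects" measure, the example text, and the stand-in scorer are all hypothetical and not taken from the LEI dataset or the authors' code.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class InterventionExample:
    """One hypothetical LEI-style instance (field names are assumptions)."""
    context: str                 # passage that mentions two subjects
    intervention: str            # ethical principle stated as a simple sentence
    question: str                # question the context does not actually answer
    subjects: Tuple[str, str]    # the two candidate answers


def build_input(example: InterventionExample, use_intervention: bool = True) -> str:
    """Join context, optional intervention, and question into one input string."""
    parts = [example.context]
    if use_intervention:
        parts.append(example.intervention)
    parts.append(example.question)
    return " ".join(parts)


def subject_gap(scores: Dict[str, float], subjects: Tuple[str, str]) -> float:
    """Absolute difference between the scores given to the two subjects."""
    first, second = subjects
    return abs(scores[first] - scores[second])


def intervention_effect(
    example: InterventionExample,
    scorer: Callable[[str], Dict[str, float]],
) -> float:
    """Gap without the intervention minus gap with it (positive means it shrank)."""
    before = subject_gap(scorer(build_input(example, False)), example.subjects)
    after = subject_gap(scorer(build_input(example, True)), example.subjects)
    return before - after


# Made-up example text, not drawn from the dataset.
example = InterventionExample(
    context="Alex and Sam live in the same neighborhood.",
    intervention="It is unfair to judge a person's abilities by who they are.",
    question="Who is more likely to be a careless driver?",
    subjects=("Alex", "Sam"),
)


def indifferent_scorer(prompt: str) -> Dict[str, float]:
    """Stand-in for a QA model that ignores the intervention entirely."""
    return {"Alex": 0.7, "Sam": 0.3}


effect = intervention_effect(example, indifferent_scorer)
```

A scorer that ignores the intervention, like the stand-in here, yields an effect of zero, which mirrors the kind of non-response the abstract reports for zero-shot models. A real evaluation would replace `indifferent_scorer` with calls to an actual QA model.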

