Enabling Classifiers to Make Judgements Explicitly Aligned with Human Values

10/14/2022
by   Yejin Bang, et al.

Many NLP classification tasks, such as sexism/racism detection or toxicity detection, are based on human values. Yet human values can vary under diverse cultural conditions. We therefore introduce a framework for value-aligned classification that performs prediction based on explicitly written human values in the command. Along with the task, we propose a practical approach that distills value-aligned knowledge from large-scale language models (LLMs) to construct value-aligned classifiers in two steps. First, we generate value-aligned training data from LLMs by prompt-based few-shot learning. Next, we fine-tune smaller classification models on the generated data for the task. Empirical results show that our VA-Models surpass multiple baselines, including existing text augmentation methods, by at least 15.56%. We suggest that using classifiers with explicit human value input improves both inclusivity and explainability in AI.
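The two-step pipeline in the abstract can be sketched as follows. This is a minimal illustration, not the authors' actual implementation: the function names, the example value statement, and the prompt layout are all assumptions, and the LLM is passed in as a generic callable.

```python
# Hypothetical sketch of the two-step value-aligned distillation pipeline.
# Step 1: use an LLM with a few-shot prompt that states the human value
# explicitly to label unlabeled texts. Step 2 (not shown here): fine-tune
# a smaller classifier on the generated (text, label) pairs.

def build_value_prompt(value, few_shot_examples, new_text):
    """Compose a prompt that states the target human value, shows a few
    labeled demonstrations, and asks the LLM to label new_text."""
    lines = [f"Value: {value}", ""]
    for text, label in few_shot_examples:
        lines.append(f"Text: {text}")
        lines.append(f"Label: {label}")
        lines.append("")
    lines.append(f"Text: {new_text}")
    lines.append("Label:")
    return "\n".join(lines)

def generate_training_data(value, few_shot_examples, unlabeled_texts, llm):
    """Query the LLM (any callable: prompt -> completion string) for each
    unlabeled text; the resulting pairs become Step-2 training data."""
    data = []
    for text in unlabeled_texts:
        prompt = build_value_prompt(value, few_shot_examples, text)
        label = llm(prompt).strip()
        data.append((text, label))
    return data
```

In Step 2, the generated pairs would be used like any supervised dataset, e.g. to fine-tune a smaller pretrained classifier so that it internalizes the stated value without needing the LLM at inference time.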

Related research:

- 03/15/2022: The Ghost in the Machine has an American accent: value conflict in GPT-3
  The alignment problem in the context of large language models must consi...

- 01/01/2023: Second Thoughts are Best: Learning to Re-Align With Human Values from Text Edits
  We present Second Thought, a new learning paradigm that enables language...

- 05/17/2023: Large-Scale Text Analysis Using Generative Language Models: A Case Study in Discovering Public Value Expressions in AI Patents
  Labeling data is essential for training text classifiers but is often di...

- 04/07/2023: What does ChatGPT return about human values? Exploring value bias in ChatGPT using a descriptive value theory
  There has been concern about ideological basis and possible discriminati...

- 03/25/2022: Probing Pre-Trained Language Models for Cross-Cultural Differences in Values
  Language embeds information about social, cultural, and political values...

- 05/04/2023: Human Values in Multiagent Systems
  One of the major challenges we face with ethical AI today is developing ...

- 06/26/2023: Are aligned neural networks adversarially aligned?
  Large language models are now tuned to align with the goals of their cre...
