Training Socially Aligned Language Models in Simulated Human Society

05/26/2023
by Ruibo Liu, et al.

Social alignment in AI systems aims to ensure that these models behave according to established societal values. However, unlike humans, who derive consensus on value judgments through social interaction, current language models (LMs) are trained to rigidly replicate their training corpus in isolation, leading to subpar generalization in unfamiliar scenarios and vulnerability to adversarial attacks. This work presents a novel training paradigm that permits LMs to learn from simulated social interactions. In comparison to existing methodologies, our approach is considerably more scalable and efficient, demonstrating superior performance in alignment benchmarks and human evaluations. This paradigm shift in the training of LMs brings us a step closer to developing AI systems that can robustly and accurately reflect societal norms and values.

research
05/09/2022

Aligned with Whom? Direct and social goals for AI systems

As artificial intelligence (AI) becomes more powerful and widespread, th...
research
01/01/2023

Second Thoughts are Best: Learning to Re-Align With Human Values from Text Edits

We present Second Thought, a new learning paradigm that enables language...
research
08/01/2023

SurveyLM: A platform to explore emerging value perspectives in augmented language models' behaviors

This white paper presents our work on SurveyLM, a platform for analyzing...
research
05/04/2023

Human Values in Multiagent Systems

One of the major challenges we face with ethical AI today is developing ...
research
08/05/2020

Aligning AI With Shared Human Values

We show how to assess a language model's knowledge of basic concepts of ...
research
09/12/2023

Do Generative Large Language Models need billions of parameters?

This paper presents novel systems and methodologies for the development ...
research
10/06/2022

A Human Rights-Based Approach to Responsible AI

Research on fairness, accountability, transparency and ethics of AI-base...
