SurveyLM: A platform to explore emerging value perspectives in augmented language models' behaviors

by Steve J. Bickley, et al.

This white paper presents our work on SurveyLM, a platform for analyzing the emergent alignment behaviors of augmented language models (ALMs) through their dynamically evolving attitudes and value perspectives in complex social contexts. Social Artificial Intelligence (AI) systems such as ALMs often operate in nuanced social scenarios where there is no single correct response, or where the answer depends heavily on contextual factors, so an in-depth understanding of their alignment dynamics is needed. To address this, we apply survey and experimental methodologies, traditionally used to study social behaviors, to evaluate ALMs systematically and gain insight into their alignment and emergent behaviors. The SurveyLM platform also uses the ALMs' own feedback to improve survey and experiment designs, an underutilized capability of ALMs that accelerates the development and testing of high-quality survey frameworks while conserving resources. Through SurveyLM, we aim to shed light on the factors influencing ALMs' emergent behaviors, facilitate their alignment with human intentions and expectations, and thereby contribute to the responsible development and deployment of advanced social AI systems. This white paper underscores the platform's potential to deliver robust results, highlighting its significance for alignment research and its implications for future social AI systems.




