Alignment of Language Agents

03/26/2021
by Zachary Kenton et al.

For artificial intelligence to be beneficial to humans, the behaviour of AI agents needs to be aligned with what humans want. In this paper we discuss behavioural issues for language agents arising from accidental misspecification by the system designer. We highlight ways in which misspecification can occur, discuss the behavioural issues that could result, including deceptive or manipulative language, and review some approaches for avoiding these issues.

