Goal Alignment: A Human-Aware Account of Value Alignment Problem

02/02/2023
by Malek Mechergui, et al.

Value alignment problems arise in scenarios where the specified objectives of an AI agent do not match the true underlying objectives of its users. The problem has been widely argued to be one of the central safety problems in AI. Unfortunately, most existing works on value alignment tend to focus on issues that stem primarily from the fact that reward functions are an unintuitive mechanism for specifying objectives. However, the complexity of the objective specification mechanism is only one of many reasons why a user may misspecify their objective. A foundational cause of misalignment overlooked by these works is the inherent asymmetry between the human's expectations of the agent's behavior and the behavior the agent actually generates for the specified objective. To address this lacuna, we propose a novel formulation of the value alignment problem, named goal alignment, that focuses on a few central challenges related to value alignment. In doing so, we bridge the currently disparate research areas of value alignment and human-aware planning. Additionally, we propose a first-of-its-kind interactive algorithm that can use information generated under incorrect beliefs about the agent to determine the true underlying goal of the user.
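To make the idea of interactive goal inference concrete, the sketch below shows a minimal, hypothetical elicitation loop: the agent maintains a set of candidate goals, shows the user the plan it would execute for one candidate, and eliminates candidates whose plans the user rejects. This is not the authors' algorithm, and it does not model the user's (possibly incorrect) beliefs about the agent, which is the paper's key concern; all names (candidate_goals, plan_for, user_approves) are illustrative assumptions. In practice, user_approves might be backed by a UI that displays the plan and records a yes/no response.

```python
# Minimal, hypothetical sketch of an interactive goal-elicitation loop.
# This is an illustration, not the algorithm proposed in the paper:
# it simply eliminates candidate goals whose plans the user rejects.

from typing import Callable, List, Optional, Sequence


def infer_user_goal(
    candidate_goals: Sequence[str],
    plan_for: Callable[[str], List[str]],        # agent's planner for a candidate goal
    user_approves: Callable[[List[str]], bool],  # user's yes/no feedback on a shown plan
) -> Optional[str]:
    """Narrow down the user's goal from feedback on proposed plans."""
    remaining = list(candidate_goals)
    while remaining:
        goal = remaining[0]
        plan = plan_for(goal)        # generate the plan the agent would execute for this goal
        if user_approves(plan):      # user accepts the plan: treat this candidate as the goal
            return goal
        remaining.pop(0)             # user rejects the plan: eliminate this candidate
    return None                      # no candidate goal was consistent with the feedback
```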


