When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment

10/04/2022
by   Zhijing Jin, et al.

AI systems are becoming increasingly intertwined with human life. To collaborate effectively with humans and ensure safety, AI systems need to understand, interpret, and predict human moral judgments and decisions. Human moral judgments are often guided by rules, but not always. A central challenge for AI safety is capturing the flexibility of the human moral mind: the ability to determine when a rule should be broken, especially in novel or unusual situations. In this paper, we present a novel challenge set consisting of rule-breaking question answering (RBQA) cases that involve potentially permissible rule-breaking, inspired by recent moral psychology studies. Using a state-of-the-art large language model (LLM) as a basis, we propose a novel moral chain of thought (MoralCoT) prompting strategy that combines the strengths of LLMs with theories of moral reasoning developed in cognitive science to predict human moral judgments. MoralCoT outperforms seven existing LLMs by 6.2% F1, suggesting that modeling human reasoning may be necessary to capture the flexibility of the human moral mind. We also conduct a detailed error analysis to suggest directions for future work to improve AI safety using RBQA. Our data is open-sourced at https://huggingface.co/datasets/feradauto/MoralExceptQA and code at https://github.com/feradauto/MoralCoT
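To make the idea of a moral chain-of-thought prompt concrete, here is a minimal sketch of how such a prompt could be assembled. The specific sub-questions and the `build_moralcot_prompt` helper are illustrative assumptions for this sketch, not the paper's exact prompt; the general pattern is walking the model through cost-benefit sub-questions before asking for a final permissibility verdict.

```python
# Illustrative moral chain-of-thought prompt builder (assumed structure,
# not the exact MoralCoT prompt from the paper).

# Sub-questions probing the rule, its purpose, and the costs and benefits
# of breaking it. These are hypothetical examples of the kind of
# intermediate reasoning steps a MoralCoT-style prompt elicits.
SUBQUESTIONS = [
    "What rule applies in this situation?",
    "What is the purpose of that rule?",
    "Who would be harmed, and who would benefit, if the rule were broken?",
    "Do the benefits of breaking the rule outweigh the costs?",
]

def build_moralcot_prompt(scenario: str) -> str:
    """Compose a step-by-step prompt that walks a language model through
    the sub-questions before requesting a final yes/no judgment."""
    lines = [f"Scenario: {scenario}", ""]
    for i, question in enumerate(SUBQUESTIONS, start=1):
        lines.append(f"Step {i}: {question}")
    lines.append("")
    lines.append(
        "Final answer: Is it morally permissible to break the rule here? (yes/no)"
    )
    return "\n".join(lines)

prompt = build_moralcot_prompt(
    "A man cuts in line at a deli to quickly buy water for someone who fainted."
)
print(prompt)
```

The resulting string would be sent to an LLM as a single prompt; the model's intermediate answers to the numbered steps make its cost-benefit reasoning explicit before it commits to a verdict.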


