Let's Do a Thought Experiment: Using Counterfactuals to Improve Moral Reasoning

06/25/2023
by Xiao Ma, et al.

Language models still struggle with moral reasoning, despite their impressive performance on many other tasks. In particular, the Moral Scenarios task in MMLU (Massive Multitask Language Understanding) is among the worst-performing tasks for many language models, including GPT-3. In this work, we propose a new prompting framework, Thought Experiments, which teaches language models to do better moral reasoning using counterfactuals. Experimental results show that our framework elicits counterfactual questions and answers from the model, which in turn helps improve accuracy on the Moral Scenarios task by 9-16% compared to other zero-shot baselines. Interestingly, unlike math reasoning tasks, zero-shot Chain-of-Thought (CoT) reasoning does not work out of the box, and even reduces accuracy by around 4% compared to direct zero-shot prompting. We further observed that with minimal human supervision, in the form of 5 few-shot examples, the accuracy on the task can be improved to as much as 80%.
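The abstract only says that the framework elicits counterfactual questions and answers from the model before it gives a moral judgment. The sketch below illustrates one way such a multi-step prompting loop could look in Python; the exact prompt wording, the three-step structure, and the complete() helper are assumptions for illustration, not the paper's actual implementation (plug in whatever LLM API you use).

def complete(prompt: str) -> str:
    """Hypothetical LLM call; replace with an actual model/provider call."""
    raise NotImplementedError("Plug in a real language-model completion call here.")

def thought_experiment_judgment(scenario: str) -> str:
    """Multi-step 'thought experiment' prompting for a moral scenario (illustrative sketch)."""
    # Step 1: elicit a counterfactual question about the scenario.
    q_prompt = (
        f"Scenario: {scenario}\n"
        "Let's do a thought experiment. Pose a counterfactual question that would "
        "help decide whether the action in this scenario is morally wrong."
    )
    counterfactual_q = complete(q_prompt)

    # Step 2: have the model answer its own counterfactual question.
    a_prompt = (
        f"Scenario: {scenario}\n"
        f"Counterfactual question: {counterfactual_q}\n"
        "Answer the counterfactual question, reasoning step by step."
    )
    counterfactual_a = complete(a_prompt)

    # Step 3: ask for the final judgment, conditioned on the counterfactual Q&A.
    final_prompt = (
        f"Scenario: {scenario}\n"
        f"Counterfactual question: {counterfactual_q}\n"
        f"Counterfactual answer: {counterfactual_a}\n"
        "Given this thought experiment, is the action in the scenario morally wrong? "
        "Answer 'Wrong' or 'Not wrong'."
    )
    return complete(final_prompt)

For comparison, the direct zero-shot and zero-shot CoT baselines mentioned above would call the model once on the scenario alone, the CoT variant appending a cue such as "Let's think step by step."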


