On Second Thought, Let's Not Think Step by Step! Bias and Toxicity in Zero-Shot Reasoning

12/15/2022
by Omar Shaikh, et al.

Generating a chain of thought (CoT) can increase large language model (LLM) performance on a wide range of tasks. Zero-shot CoT evaluations, however, have been conducted primarily on logical tasks (e.g., arithmetic, commonsense QA). In this paper, we perform a controlled evaluation of zero-shot CoT across two sensitive domains: harmful questions and stereotype benchmarks. We find that using zero-shot CoT reasoning in a prompt can significantly increase a model's likelihood of producing undesirable output. Without future advances in alignment or explicit mitigation instructions, zero-shot CoT should be avoided on tasks where models can make inferences about marginalized groups or harmful topics.
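
For readers unfamiliar with the technique under evaluation, the sketch below illustrates the standard zero-shot CoT setup (Kojima et al., 2022): a reasoning chain is elicited by appending the trigger phrase "Let's think step by step." to the prompt, followed by a second call that extracts the final answer. This is a minimal illustration, not the paper's experimental harness; `query_model` is a hypothetical stand-in for whatever LLM completion API is used.

COT_TRIGGER = "Let's think step by step."
ANSWER_TRIGGER = "Therefore, the answer is"

def query_model(prompt: str) -> str:
    """Hypothetical LLM call; replace with a real completion API."""
    raise NotImplementedError

def zero_shot(question: str) -> str:
    """Standard zero-shot prompting: ask the question directly."""
    return query_model(f"Q: {question}\nA:")

def zero_shot_cot(question: str) -> str:
    """Zero-shot CoT: elicit a reasoning chain, then extract an answer."""
    # Stage 1: reasoning extraction -- append the CoT trigger phrase.
    reasoning = query_model(f"Q: {question}\nA: {COT_TRIGGER}")
    # Stage 2: answer extraction -- feed the reasoning back and prompt for the answer.
    return query_model(f"Q: {question}\nA: {COT_TRIGGER} {reasoning}\n{ANSWER_TRIGGER}")

The paper's finding is that, on prompts touching harmful topics or marginalized groups, the second path (zero_shot_cot) produces undesirable output more often than the first (zero_shot).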
