On Large Language Models' Selection Bias in Multi-Choice Questions

09/07/2023
by Chujie Zheng, et al.

Multi-choice questions (MCQs) serve as a common yet important task format in research on large language models (LLMs). Our work shows that LLMs exhibit an inherent "selection bias" in MCQs: a preference for selecting options located at specific positions (like "Option C"). This bias is prevalent across various LLMs and makes their performance vulnerable to changes in option positions. We identify option numbering, i.e., the ID symbols A/B/C/D associated with the options, as a primary cause of selection bias. To mitigate it, we propose a new method called PriDe. PriDe decomposes the observed model prediction distribution into an intrinsic prediction over option contents and a prior distribution over option IDs. It estimates the prior by permuting option contents on a small number of test samples and then uses this prior to debias the remaining test samples. We demonstrate that, as a label-free, inference-time method, PriDe achieves more effective and computationally efficient debiasing than strong baselines. We further show that the priors estimated by PriDe generalize well across domains, highlighting its practical potential in broader scenarios.
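The decomposition behind PriDe can be illustrated with a short sketch. The snippet below is a simplified illustration under stated assumptions, not the paper's exact procedure: `model_probs(question, options)` is a hypothetical helper that queries an LLM and returns its probability distribution over the option IDs (A/B/C/D) for the options presented in that order. The prior over IDs is approximated by averaging the observed distributions over cyclic permutations of the option contents on a few calibration samples; later samples are then debiased by dividing out this prior and renormalizing.

```python
import numpy as np

def estimate_id_prior(model_probs, calibration_samples, n_ids=4):
    """Approximate a prior over option IDs by cyclically permuting option
    contents on a small calibration set and averaging the observed
    distributions, so content preferences roughly cancel out."""
    prior = np.zeros(n_ids)
    count = 0
    for question, options in calibration_samples:
        for shift in range(n_ids):
            # Present the same contents under every cyclic rotation of positions.
            permuted = options[shift:] + options[:shift]
            prior += np.asarray(model_probs(question, permuted))
            count += 1
    prior /= count
    return prior / prior.sum()

def debias_prediction(model_probs, question, options, prior):
    """Recover an approximately position-independent prediction for a new
    test sample by dividing the observed distribution over option IDs by
    the estimated prior and renormalizing."""
    observed = np.asarray(model_probs(question, options))
    debiased = observed / np.clip(prior, 1e-12, None)
    return debiased / debiased.sum()
```

Because the prior is estimated once, up front, and without labels, the extra cost is limited to the calibration queries; every subsequent test question is debiased with a single forward pass, consistent with the label-free, inference-time setting described above.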

Related research

08/22/2023 · Large Language Models Sensitivity to The Order of Options in Multiple-Choice Questions
Large Language Models (LLMs) have demonstrated remarkable capabilities i...

10/22/2022 · Leveraging Large Language Models for Multiple Choice Question Answering
While large language models (LLMs) like GPT-3 have achieved impressive r...

08/22/2022 · Selection Collider Bias in Large Language Models
In this paper we motivate the causal mechanisms behind sample selection ...

04/04/2019 · ElimiNet: A Model for Eliminating Options for Reading Comprehension with Multiple Choice Questions
The task of Reading Comprehension with Multiple Choice Questions, requir...

09/30/2022 · Exploiting Selection Bias on Underspecified Tasks in Large Language Models
In this paper we motivate the causal mechanisms behind sample selection ...

05/24/2023 · Have Large Language Models Developed a Personality?: Applicability of Self-Assessment Tests in Measuring Personality in LLMs
Have Large Language Models (LLMs) developed a personality? The short ans...

12/01/2022 · Learning to Select from Multiple Options
Many NLP tasks can be regarded as a selection problem from a set of opti...
