ExpertQA: Expert-Curated Questions and Attributed Answers

09/14/2023
by Chaitanya Malaviya, et al.

As language models are adopted by a more sophisticated and diverse set of users, guaranteeing that they provide factually correct information supported by verifiable sources becomes critical across fields of study and professions. This is especially the case for high-stakes fields, such as medicine and law, where the risk of propagating false information is high and can lead to undesirable societal consequences. Previous work studying factuality and attribution has not focused on analyzing these characteristics of language model outputs in domain-specific scenarios. In this work, we present an evaluation study analyzing various axes of factuality and attribution in responses from a few systems, by bringing domain experts into the loop. Specifically, we first collect expert-curated questions from 484 participants across 32 fields of study, and then ask the same experts to evaluate generated responses to their own questions. We also ask experts to revise answers produced by language models, which leads to ExpertQA, a high-quality long-form QA dataset with 2177 questions spanning 32 fields, along with verified answers and attributions for claims in the answers.

research
03/21/2022

Teaching language models to support answers with verified quotes

Recent large language models often answer factual questions correctly. B...
research
02/02/2023

Creating a Large Language Model of a Philosopher

Can large language models be trained to produce philosophical texts that...
research
04/21/2023

Who's the Best Detective? LLMs vs. MLs in Detecting Incoherent Fourth Grade Math Answers

Written answers to open-ended questions can have a higher long-term effe...
research
05/23/2023

Knowledge of Knowledge: Exploring Known-Unknowns Uncertainty with Large Language Models

This paper investigates the capabilities of Large Language Models (LLMs)...
research
04/06/2023

ChatGPT-Crawler: Find out if ChatGPT really knows what it's talking about

Large language models have gained considerable interest for their impres...
research
03/29/2023

Evaluating GPT-3.5 and GPT-4 Models on Brazilian University Admission Exams

The present study aims to explore the capabilities of Language Models (L...
research
08/17/2022

HELP ME THINK: A Simple Prompting Strategy for Non-experts to Create Customized Content with Models

Controlling the text generated by language models and customizing the co...
