HAGRID: A Human-LLM Collaborative Dataset for Generative Information-Seeking with Attribution

07/31/2023
by Ehsan Kamalloo, et al.

The rise of large language models (LLMs) has had a transformative impact on search, ushering in a new era of search engines that can generate search results as natural language text, accompanied by citations to supporting sources. Building generative information-seeking models demands openly accessible datasets, which are currently lacking. In this paper, we introduce a new dataset, HAGRID (Human-in-the-loop Attributable Generative Retrieval for Information-seeking Dataset), for building end-to-end generative information-seeking models that can retrieve candidate quotes and generate attributed explanations. Unlike recent efforts that focus on human evaluation of black-box proprietary search engines, we build our dataset atop the English subset of MIRACL, a publicly available information retrieval dataset. HAGRID is constructed through human and LLM collaboration. We first automatically collect attributed explanations that follow an in-context citation style using an LLM, namely GPT-3.5. Next, we ask human annotators to evaluate the LLM explanations based on two criteria: informativeness and attributability. HAGRID is intended to serve as a catalyst for the development of information-seeking models with better attribution capabilities.
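
To make the construction described above concrete, the following is a minimal sketch of how one record (a query, its candidate quotes, an LLM explanation with in-context citations, and the two human judgments) might be represented and checked. It is not the dataset's actual schema: the class `AttributedAnswer`, its field names, the helper functions, and the bracketed `[1]`-style citation markers are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AttributedAnswer:
    """Hypothetical record layout; field names are illustrative assumptions."""
    query: str
    quotes: List[str]                    # candidate quotes retrieved for the query
    answer: str                          # LLM explanation with in-context citations
    informative: Optional[bool] = None   # human judgment, unset until annotated
    attributable: Optional[bool] = None  # human judgment, unset until annotated


# Assumed citation style: 1-based quote ids in square brackets, e.g. "[2]".
CITATION = re.compile(r"\[(\d+)\]")


def cited_quote_ids(answer: str) -> List[int]:
    """Return the distinct quote ids cited in the answer, in order of first use."""
    seen: List[int] = []
    for match in CITATION.finditer(answer):
        quote_id = int(match.group(1))
        if quote_id not in seen:
            seen.append(quote_id)
    return seen


def dangling_citations(record: AttributedAnswer) -> List[int]:
    """Return cited ids that do not correspond to any candidate quote."""
    return [i for i in cited_quote_ids(record.answer)
            if not 1 <= i <= len(record.quotes)]
```

A check like `dangling_citations` only confirms that each citation points at an existing quote. Whether the cited quote actually supports the claim is the attributability judgment that the abstract assigns to human annotators.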


research
05/24/2023

Enabling Large Language Models to Generate Text with Citations

Large language models (LLMs) have emerged as a widely-used tool for info...
research
04/19/2023

Evaluating Verifiability in Generative Search Engines

Generative search engines directly generate responses to user queries, a...
research
10/05/2021

Voice Information Retrieval In Collaborative Information Seeking

Voice information retrieval is a technique that provides Information Ret...
research
02/11/2023

Characterizing Attribution and Fluency Tradeoffs for Retrieval-Augmented Large Language Models

Despite recent progress, it has been difficult to prevent semantic hallu...
research
02/20/2019

World Discovery Models

As humans we are driven by a strong desire for seeking novelty in our wo...
research
05/10/2023

Automatic Evaluation of Attribution by Large Language Models

A recent focus of large language model (LLM) development, as exemplified...
research
04/02/2019

Asking the Right Question: Inferring Advice-Seeking Intentions from Personal Narratives

People often share personal narratives in order to seek advice from othe...
