RELiC: Retrieving Evidence for Literary Claims

03/18/2022
by Katherine Thai, et al.

Humanities scholars commonly provide evidence for claims that they make about a work of literature (e.g., a novel) in the form of quotations from the work. We collect a large-scale dataset (RELiC) of 78K literary quotations and surrounding critical analysis and use it to formulate the novel task of literary evidence retrieval, in which models are given an excerpt of literary analysis surrounding a masked quotation and asked to retrieve the quoted passage from the set of all passages in the work. Solving this retrieval task requires a deep understanding of complex literary and linguistic phenomena, which proves challenging to methods that overwhelmingly rely on lexical and semantic similarity matching. We implement a RoBERTa-based dense passage retriever for this task that outperforms existing pretrained information retrieval baselines; however, experiments and analysis by human domain experts indicate that there is substantial room for improvement over our dense retriever.
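The task setup described above (a context of literary analysis with a masked quotation, and a candidate set made up of every passage in the work) can be illustrated with a minimal sketch. This is not the paper's RoBERTa-based dense retriever: the scorer below is a plain bag-of-words cosine similarity, swapped in only to show the input, the ranking step, and a recall@k check. The function names, the stopword list, the `[MASK]` placeholder, and the toy passages are all assumptions invented for illustration and are not drawn from RELiC.

```python
import math
import re
from collections import Counter

# Tiny, hand-picked stopword list (an assumption for this toy example).
STOPWORDS = {"the", "a", "an", "and", "of", "on", "in", "to", "is", "her",
             "she", "was", "had", "who", "what", "from", "when", "all"}

def tokenize(text):
    """Lowercase, split into word tokens, and drop stopwords."""
    return [t for t in re.findall(r"[a-z']+", text.lower()) if t not in STOPWORDS]

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank_passages(context, passages):
    """Return candidate indices sorted from best to worst match.

    Ties are broken by passage index so the ranking is deterministic.
    """
    query = Counter(tokenize(context))
    scored = [(cosine(query, Counter(tokenize(p))), i) for i, p in enumerate(passages)]
    return [i for _, i in sorted(scored, key=lambda s: (-s[0], s[1]))]

def recall_at_k(ranking, gold_index, k):
    """1.0 if the quoted passage appears in the top k results, else 0.0."""
    return 1.0 if gold_index in ranking[:k] else 0.0

# Invented toy data: three "passages" from an imaginary work, and an
# analysis excerpt whose quotation has been replaced by [MASK].
passages = [
    "The rain fell on the harbor all night and the ships stayed in port.",
    "She opened the letter slowly, afraid of what her sister had written.",
    "The garden was overgrown, and nobody remembered who had planted the roses.",
]
context = ("The narrator's dread is clearest when the letter from her sister "
           "arrives: [MASK] The hesitation delays the revelation.")
gold_index = 1  # index of the passage that was masked out

ranking = rank_passages(context, passages)
print(ranking, recall_at_k(ranking, gold_index, k=1))
```

The toy example succeeds only because the analysis happens to reuse words from the quotation. As the abstract notes, real literary analysis often does not, which is why methods that rely on lexical and semantic similarity matching struggle on this task.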

