CSFCube – A Test Collection of Computer Science Research Articles for Faceted Query by Example

03/24/2021
by Sheshera Mysore, et al.

Query by Example is a well-known information retrieval task in which the user chooses a document as the search query and the goal is to retrieve relevant documents from a large collection. However, a document often covers multiple aspects of a topic. To address this scenario we introduce the task of faceted Query by Example, in which users can also specify a finer-grained aspect in addition to the input query document. We focus on the application of this task to scientific literature search. As one solution to this problem, we envision models that retrieve scientific papers analogous to a query scientific paper along specifically chosen rhetorical structure elements. In this work, the rhetorical structure elements, which we refer to as facets, indicate the "background", "method", or "result" aspects of a scientific paper. We introduce and describe an expert-annotated test collection to evaluate models trained to perform this task. Our test collection consists of a diverse set of 50 query documents drawn from computational linguistics and machine learning venues. We carefully followed the annotation guidelines used by TREC for depth-k pooling (k = 100 or 250), and the resulting data collection consists of graded relevance scores with high annotation agreement. The data is freely available for research purposes.
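As a rough illustration of the depth-k pooling mentioned above, the sketch below builds the set of candidates to annotate for a single query and facet by taking the union of the top k results from several ranked lists. The system names, paper identifiers, and the `depth_k_pool` helper are hypothetical and are not taken from the collection; the abstract does not say which or how many systems contributed to the pool.

```python
from typing import Dict, List, Set


def depth_k_pool(runs: Dict[str, List[str]], k: int) -> Set[str]:
    """Return the union of the top-k documents from each ranked list."""
    pool: Set[str] = set()
    for ranked_docs in runs.values():
        # Only the first k entries of each run enter the annotation pool.
        pool.update(ranked_docs[:k])
    return pool


# Hypothetical ranked lists for one (query paper, facet) pair.
runs = {
    "system_a": ["p1", "p2", "p3", "p4"],
    "system_b": ["p3", "p5", "p1", "p6"],
}

# The collection uses k = 100 or 250; k = 2 keeps the toy example readable.
pool = depth_k_pool(runs, k=2)
print(sorted(pool))
```

Documents outside the pool are left unjudged, so under this scheme the depth k trades annotation cost against how completely the relevant documents are covered.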


