Evaluating Dense Passage Retrieval using Transformers

08/15/2022
by   Nima Sadri, et al.

Although Transformer-based representational retrieval models have made major advances in the past few years, and despite widely accepted conventions and best practices for testing such models, no standardized evaluation framework for them has been developed. In this work, we formalize the best practices and conventions followed by researchers in the literature, paving the way for more standardized evaluations and therefore fairer comparisons between models. Our framework (1) embeds the documents and queries; (2) for each query-document pair, computes the relevance score as the dot product of the document and query embeddings; (3) uses the set of the MSMARCO dataset to evaluate the models; (4) uses the script to calculate MRR@100, which is the primary metric used to evaluate the models. Most importantly, we showcase the use of this framework by experimenting on some of the most well-known dense retrieval models.
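
The scoring and metric steps listed in the abstract can be illustrated with a minimal sketch. It is not the authors' framework or evaluation script: the function names (`dot`, `rank_documents`, `mrr_at_k`) are hypothetical, and the embeddings are hand-written toy vectors standing in for the output of a Transformer encoder. It assumes relevance judgments are given as a set of relevant document IDs per query.

```python
from typing import Dict, List, Sequence, Set


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Dot product of two equal-length vectors (the relevance score)."""
    return sum(a * b for a, b in zip(u, v))


def rank_documents(
    query_vec: Sequence[float],
    doc_vecs: Dict[str, Sequence[float]],
    k: int = 100,
) -> List[str]:
    """Return the IDs of the top-k documents by dot-product score."""
    scored = sorted(
        doc_vecs.items(),
        key=lambda item: dot(query_vec, item[1]),
        reverse=True,
    )
    return [doc_id for doc_id, _ in scored[:k]]


def mrr_at_k(
    rankings: Dict[str, List[str]],
    qrels: Dict[str, Set[str]],
    k: int = 100,
) -> float:
    """Mean reciprocal rank of the first relevant document within the top k."""
    if not rankings:
        return 0.0
    total = 0.0
    for qid, ranked in rankings.items():
        relevant = qrels.get(qid, set())
        for rank, doc_id in enumerate(ranked[:k], start=1):
            if doc_id in relevant:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(rankings)


# Toy data (assumed for illustration; real embeddings come from an encoder).
docs = {"d1": [1.0, 0.0], "d2": [0.0, 1.0], "d3": [0.7, 0.7]}
queries = {"q1": [1.0, 0.1], "q2": [0.2, 1.0]}
qrels = {"q1": {"d3"}, "q2": {"d2"}}

rankings = {qid: rank_documents(vec, docs) for qid, vec in queries.items()}
mrr = mrr_at_k(rankings, qrels, k=100)
print(rankings)
print(round(mrr, 4))
```

In this toy run the relevant document for `q1` is ranked second and the one for `q2` is ranked first, so the reciprocal ranks are 1/2 and 1, and their mean is 0.75. A query with no relevant document in the top k contributes 0 to the mean.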


research
12/18/2022

Curriculum Sampling for Dense Retrieval with Document Expansion

The dual-encoder has become the de facto architecture for dense retrieva...
research
11/08/2018

An Axiomatic Study of Query Terms Order in Ad-hoc Retrieval

Classic retrieval methods use simple bag-of-word representations for que...
research
10/22/2021

Wacky Weights in Learned Sparse Representations and the Revenge of Score-at-a-Time Query Evaluation

Recent advances in retrieval models based on learned sparse representati...
research
07/31/2023

Lexically-Accelerated Dense Retrieval

Retrieval approaches that score documents based on learned dense vectors...
research
01/21/2022

Less is Less: When Are Snippets Insufficient for Human vs Machine Relevance Estimation?

Traditional information retrieval (IR) ranking models process the full t...
research
08/16/2021

My Fuzzer Beats Them All! Developing a Framework for Fair Evaluation and Comparison of Fuzzers

Fuzzing has become one of the most popular techniques to identify bugs i...
research
06/13/2023

Resources for Brewing BEIR: Reproducible Reference Models and an Official Leaderboard

BEIR is a benchmark dataset for zero-shot evaluation of information retr...
