Learnt Sparsity for Effective and Interpretable Document Ranking

06/23/2021
by Jurek Leonhardt, et al.

Machine learning models for the ad-hoc retrieval of documents and passages have recently shown impressive improvements due to better language understanding using large pre-trained language models. However, these over-parameterized models are inherently non-interpretable and do not provide any information on the parts of the documents that were used to arrive at a certain prediction. In this paper we introduce the select-and-rank paradigm for document ranking, where interpretability is explicitly ensured when scoring longer documents. Specifically, we first select sentences in a document based on the input query and then predict the query-document score based only on the selected sentences, which thereby act as an explanation. We treat sentence selection as a latent variable trained jointly with the ranker from the final output. We conduct extensive experiments to demonstrate that our inherently interpretable select-and-rank approach is competitive with other state-of-the-art methods and sometimes even outperforms them. This effectiveness is due to our novel end-to-end training approach based on weighted reservoir sampling, which manages to train the selector despite the stochastic sentence selection. We also show that our sentence selection approach can be used to provide explanations for models that operate on only parts of the document, such as BERT.
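The abstract names weighted reservoir sampling as the basis for stochastic sentence selection but gives no further detail. As a rough illustration only, the sketch below uses one common weighted reservoir scheme (the Efraimidis-Spirakis key method) to draw k sentences without replacement and then scores the document from those sentences alone. The function names, the `rank_fn` placeholder, and the use of raw selector scores as sampling weights are all assumptions made for this example, not the authors' implementation; how gradients reach the selector during training is not shown.

```python
import heapq
import random


def weighted_reservoir_sample(weights, k, rng=None):
    """Sample up to k distinct indices, favoring larger weights.

    Each item with positive weight w gets a random key u ** (1 / w),
    with u drawn uniformly from [0, 1); the k largest keys are kept.
    Items with non-positive weight are never selected.
    """
    if k <= 0:
        return []
    rng = rng or random.Random()
    heap = []  # min-heap of (key, index), holds the current top-k keys
    for index, weight in enumerate(weights):
        if weight <= 0:
            continue
        key = rng.random() ** (1.0 / weight)
        if len(heap) < k:
            heapq.heappush(heap, (key, index))
        elif key > heap[0][0]:
            heapq.heapreplace(heap, (key, index))
    # Return indices in document order so the selection reads naturally.
    return sorted(index for _, index in heap)


def select_and_rank(sentences, selector_scores, k, rank_fn, rng=None):
    """Hypothetical select-and-rank step: sample k sentences, then score.

    rank_fn is a placeholder for any ranker that maps the selected
    sentences to a query-document score.
    """
    chosen = weighted_reservoir_sample(selector_scores, k, rng)
    selected = [sentences[i] for i in chosen]
    return rank_fn(selected), selected
```

Because the ranker only ever sees the sampled sentences, the returned `selected` list doubles as the explanation for the score, which is the property the select-and-rank paradigm is built around.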


research
07/06/2018

Neural Document Summarization by Jointly Learning to Score and Select Sentences

Sentence scoring and sentence selection are two main steps in extractive...
research
03/30/2021

An In-depth Analysis of Passage-Level Label Transfer for Contextual Document Ranking

Recently introduced pre-trained contextualized autoregressive models lik...
research
04/15/2019

CEDR: Contextualized Embeddings for Document Ranking

Although considerable attention has been given to neural ranking archite...
research
05/20/2021

Intra-Document Cascading: Learning to Select Passages for Neural Document Ranking

An emerging recipe for achieving state-of-the-art effectiveness in neura...
research
06/22/2019

RLTM: An Efficient Neural IR Framework for Long Documents

Deep neural networks have achieved significant improvements in informati...
research
11/25/2017

Neural Ranking Models with Multiple Document Fields

Deep neural networks have recently shown promise in the ad-hoc retrieval...
research
03/26/2019

Simple Applications of BERT for Ad Hoc Document Retrieval

Following recent successes in applying BERT to question answering, we ex...
