Finding Inverse Document Frequency Information in BERT

02/24/2022
by Jaekeol Choi, et al.

For many decades, BM25 and its variants have been the dominant approach to document retrieval; their two underlying features are Term Frequency (TF) and Inverse Document Frequency (IDF). This traditional approach, however, is rapidly being replaced by Neural Ranking Models (NRMs) that can exploit semantic features. In this work, we consider BERT-based NRMs and study whether IDF information is present in them. This simple question is interesting because IDF has been indispensable for traditional lexical matching, yet global features like IDF are not explicitly learned by neural language models, including BERT. We adopt linear probing as the main analysis tool because typical BERT-based NRMs use linear or inner-product-based score aggregators. We analyze the input embeddings, the representations of all BERT layers, and the self-attention weights of the CLS token. By studying the MS MARCO dataset with three BERT-based models, we show that all of them contain information that is strongly dependent on IDF.
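
The following is a minimal sketch of the linear-probing idea described above: regress corpus-level IDF values onto BERT's static input embeddings and check how much variance a linear probe explains. It assumes the Hugging Face transformers and scikit-learn libraries; the toy corpus, the IDF smoothing, and the Ridge probe are illustrative assumptions, not the authors' exact experimental protocol.

```python
import math

import torch
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from transformers import AutoModel, AutoTokenizer

# Toy corpus standing in for MS MARCO passages (assumption for illustration).
corpus = [
    "the quick brown fox jumps over the lazy dog",
    "bm25 scores documents with term frequency and inverse document frequency",
    "neural ranking models learn semantic features from text",
    "bert produces contextualized representations for every token",
]

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

# 1) Compute a smoothed IDF for each BERT token id over the corpus.
N = len(corpus)
df: dict[int, int] = {}
for doc in corpus:
    for tid in set(tokenizer(doc, add_special_tokens=False)["input_ids"]):
        df[tid] = df.get(tid, 0) + 1
idf = {tid: math.log(N / (1 + d)) + 1.0 for tid, d in df.items()}

# 2) Pair each token id's static input embedding with its IDF value.
emb = model.get_input_embeddings().weight.detach()  # (vocab_size, hidden_dim)
token_ids = sorted(idf)
X = emb[torch.tensor(token_ids)].numpy()
y = [idf[tid] for tid in token_ids]

# 3) Fit a linear probe and report how well it predicts IDF (R^2).
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
probe = Ridge(alpha=1.0).fit(X_tr, y_tr)
print(f"linear probe R^2 on held-out tokens: {probe.score(X_te, y_te):.3f}")
```

The paper's per-layer analysis would follow the same recipe: swap the feature matrix X for hidden states from any layer (obtainable via output_hidden_states=True) or for CLS self-attention weights, and observe how the probe's fit changes.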

Related research

04/15/2019
CEDR: Contextualized Embeddings for Document Ranking
Although considerable attention has been given to neural ranking archite...

12/15/2020
Traditional IR rivals neural models on the MS MARCO Document Ranking Leaderboard
This short document describes a traditional IR system that achieved MRR@...

04/15/2019
Contextualized Word Representations for Document Re-Ranking
Although considerable attention has been given to neural ranking archite...

04/04/2023
San-BERT: Extractive Summarization for Sanskrit Documents using BERT and its variants
In this work, we develop language models for the Sanskrit language, name...

08/19/2020
UoB at SemEval-2020 Task 12: Boosting BERT with Corpus Level Information
Pre-trained language model word representations, such as BERT, have been ...

10/11/2019
exBERT: A Visual Analysis Tool to Explore Learned Representations in Transformers Models
Large language models can produce powerful contextual representations th...
