Entity-aware Transformers for Entity Search

05/02/2022
by Emma J. Gerritse, et al.

Pre-trained language models such as BERT have been a key ingredient for achieving state-of-the-art results on a variety of tasks in natural language processing and, more recently, also in information retrieval. Recent research even claims that BERT is able to capture factual knowledge about entity relations and properties, the kind of information commonly obtained from knowledge graphs. This paper investigates the following question: Do BERT-based entity retrieval models benefit from additional entity information stored in knowledge graphs? To address this research question, we map entity embeddings into the same input space as a pre-trained BERT model and inject these entity embeddings into the BERT model. This entity-enriched language model is then employed on the entity retrieval task. We show that the entity-enriched BERT model improves effectiveness on entity-oriented queries over a regular BERT model, establishing a new state-of-the-art result for the entity retrieval task, with substantial improvements for complex natural language queries and queries requesting a list of entities with a certain property. Additionally, we show that the entity information provided by our entity-enriched model particularly helps queries related to less popular entities. Finally, we observe empirically that the entity-enriched BERT models enable fine-tuning on limited training data, which would otherwise not be feasible due to the known instabilities of BERT in few-sample fine-tuning, thereby contributing to data-efficient training of BERT for entity search.
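The injection step described above can be sketched as a learned linear projection from the entity embedding space into BERT's input embedding space, with the projected vectors appended to the token embedding sequence. This is a minimal shape-level illustration, not the paper's implementation: the dimensions, the random placeholder embeddings, and the helper names (`project_entity`, `enrich_input`) are assumptions for the example; in practice the projection would be trained jointly with the retrieval objective.

```python
import numpy as np

rng = np.random.default_rng(0)

BERT_DIM = 768    # hidden size of bert-base
ENTITY_DIM = 300  # assumed size of the pre-trained entity embeddings

# Hypothetical pre-trained entity embeddings (random stand-ins here).
entity_emb = {"Radboud_University": rng.standard_normal(ENTITY_DIM)}

# Learned projection mapping entity space -> BERT input space.
# Random placeholder weights; only the shapes matter for this sketch.
W = rng.standard_normal((ENTITY_DIM, BERT_DIM)) * 0.02
b = np.zeros(BERT_DIM)

def project_entity(name: str) -> np.ndarray:
    """Map an entity embedding into the same space as BERT token embeddings."""
    return entity_emb[name] @ W + b

def enrich_input(token_embs: np.ndarray, entity_names: list) -> np.ndarray:
    """Append projected entity embeddings to a sequence of token embeddings."""
    ents = np.stack([project_entity(n) for n in entity_names])
    return np.concatenate([token_embs, ents], axis=0)

# Stand-in for the token embeddings of a 12-token query.
tokens = rng.standard_normal((12, BERT_DIM))
enriched = enrich_input(tokens, ["Radboud_University"])
print(enriched.shape)  # (13, 768)
```

The enriched sequence can then be fed through the transformer layers like any other input, so the model attends jointly over word pieces and entity representations.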

