REL: An Entity Linker Standing on the Shoulders of Giants

Entity linking is a standard component in modern retrieval systems that is often performed by third-party toolkits. Despite the plethora of open source options, it is difficult to find a single system that has a modular architecture where certain components may be replaced, does not depend on external sources, can easily be updated to newer Wikipedia versions, and, most importantly, has state-of-the-art performance. The REL system presented in this paper aims to fill that gap. Building on state-of-the-art neural components from natural language processing research, it is provided as a Python package as well as a web API. We also report on an experimental comparison against both well-established systems and the current state-of-the-art on standard entity linking benchmarks.




1. Introduction

Entity linking (EL) refers to the task of recognizing mentions of specific entities in text and assigning unique identifiers to them from an underlying knowledge repository (Balog, 2018). The problems of entity recognition and disambiguation have traditionally been studied in the natural language processing (NLP) community. It was also NLP researchers who first recognized the utility of Wikipedia as a large-scale knowledge repository to disambiguate against (Bunescu and Paşca, 2006; Cucerzan, 2007). This line of work was quickly followed up by information retrieval (IR) researchers (Milne and Witten, 2008; Mihalcea and Csomai, 2007). Over the past years, entity linking has become a standard component in modern retrieval systems, and has been leveraged in a range of tasks, including document ranking (Xiong et al., 2017), entity retrieval (Hasibi et al., 2016), knowledge base population (Balog et al., 2013), and query recommendation (Reinanda et al., 2015). Since entity linking is not the main focus of these works, it is commonly performed by some third-party toolkit, with the resulting annotations being utilized in downstream processing. Some of the most prominent toolkits used for this purpose include DBpedia Spotlight (Mendes et al., 2011), TAGME (Ferragina and Scaiella, 2010), WAT (Piccinno et al., 2014), and FEL (Pappu et al., 2017).

Existing toolkits fall short in a number of areas. Some are unmaintained (Pappu et al., 2017); others are meant for short text and are inefficient for long text (Hasibi et al., 2017); some rely on external sources such as web search engines (Cornolti et al., 2018). Typically, they are shipped with a specific Wikipedia version that has become dated, causing difficulties when attempting to update to a recent Wikipedia (Piccinno et al., 2014; Cornolti et al., 2018). Another issue that is often not addressed is the lack of speed (throughput). Most importantly, none of the default open source entity linkers incorporate recent progress made in the NLP community on neural network-based approaches (Kolitsas et al., 2018). With this work, we aim to close that gap and remedy these problems by introducing an efficient, up-to-date entity linker with a modular architecture that eases, e.g., updates of external resources like Wikipedia.

We present REL (which stands for Radboud Entity Linker), an open source toolkit for entity linking. (REL in Dutch means mayhem, interference, or disturbance; it is also easily recognized as an abbreviation of 'relatie', relation in English.) REL stands on the shoulders of giants: it is an ensemble of multiple methods and packages from state-of-the-art natural language processing research. REL has been developed with the following design considerations:


  • Use state-of-the-art approaches for entity disambiguation (ED) (Le and Titov, 2018; Ganea and Hofmann, 2017) and named entity recognition (NER) (Akbik et al., 2018), ensuring performance on par with the state-of-the-art on end-to-end entity linking (Kolitsas et al., 2018).

  • Use a modular architecture with mention detection (using a NER approach) and entity disambiguation components. Specifically, separating mention detection from entity disambiguation enables us to choose an NER method appropriate for the context in which entity linking is employed (i.e., optimizing for recall vs. throughput).

  • Design for sufficient throughput; we report 700 ms for an average document of 300 words. Notably, most of this time is spent on NER, which could be swapped for a more efficient option.

  • Develop a lightweight solution that can be deployed on an average laptop/desktop machine; it does not need much RAM, and, importantly, it does not need a GPU.

  • Train on a recent Wikipedia dump (2019-07) and ensure easy updates to new Wikipedia versions (all necessary scripts included).

REL is available under an MIT license; it can be deployed as a Python package or used via a RESTful API.

2. Entity linking in REL

In this section, we present the entity linking method underlying REL. We follow a standard entity linking pipeline architecture (Balog, 2018), consisting of three components: (i) mention detection, (ii) candidate selection, and (iii) entity disambiguation.
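To make the three-stage decomposition concrete, the following sketch wires mention detection, candidate selection, and disambiguation together behind swappable callables. The interfaces and names here are illustrative, not REL's actual API:

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class Mention:
    start: int    # character offset in the input text
    length: int
    text: str

def link(
    text: str,
    detect: Callable[[str], List[Mention]],
    candidates: Callable[[Mention], List[str]],
    disambiguate: Callable[[Mention, List[str]], str],
) -> List[Tuple[Mention, str]]:
    """Run mention detection, candidate selection, and disambiguation in turn.
    Any of the three components can be replaced independently."""
    results = []
    for mention in detect(text):
        cands = candidates(mention)
        if cands:
            results.append((mention, disambiguate(mention, cands)))
    return results

# Toy components, just to show the stages composing:
detect = lambda t: [Mention(t.index("Belgrade"), 8, "Belgrade")] if "Belgrade" in t else []
candidates = lambda m: ["Belgrade", "Belgrade_(disambiguation)"]
disambiguate = lambda m, c: c[0]

print(link("Result in Belgrade on Friday.", detect, candidates, disambiguate))
```

Keeping the stages behind plain function interfaces is what allows, e.g., swapping the NER-based detector for a dictionary-based one without touching the rest of the pipeline.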

2.1. Mention Detection

In the mention detection step, we aim to detect all text spans that can be linked to entities. These text spans, referred to as mentions, are obtained by employing a Named Entity Recognition (NER) tool. NER taggers detect entity mentions in text and annotate them with (coarse-grained) entity types (Balog, 2018). We employ Flair (Akbik et al., 2018), a state-of-the-art NER based on contextualized word embeddings. Flair takes the input to be a sequence of characters and passes it to a bidirectional character-level neural language model to generate a contextual string embedding for each word. These embeddings are then utilized in a sequence labeling module to generate tags for NER.

Using a NER method for mention detection enables us to strike a balance between precision and recall. Another approach, which may result in higher recall, is matching all n-grams (up to a certain length) in the input text against a rich dictionary of entity names (Hasibi et al., 2015; Balog, 2018). In REL, the mention detection component can easily be replaced by another NER tagger, such as spaCy, or by a dictionary-based approach.
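The dictionary-based alternative can be sketched in a few lines; this is a toy greedy matcher (not REL's implementation), where the surface-form dictionary would in practice be the pre-computed index described in Section 3:

```python
from typing import Dict, List, Tuple

def ngram_mentions(text: str, surface_forms: Dict[str, List[str]],
                   max_len: int = 3) -> List[Tuple[int, int, str]]:
    """Match all n-grams (up to max_len tokens) against a surface-form
    dictionary; returns (start_token, n_tokens, surface_form) triples."""
    tokens = text.split()
    hits = []
    for i in range(len(tokens)):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            ngram = " ".join(tokens[i:i + n])
            if ngram in surface_forms:
                hits.append((i, n, ngram))
                break  # greedy: keep only the longest match at each position
    return hits

forms = {"Red Star": ["KK_Crvena_zvezda"], "Yugoslavia": ["Yugoslavia"]}
print(ngram_mentions("Red Star beat Dinamo in Yugoslavia", forms))
```

Such a matcher trades precision for recall: every dictionary hit becomes a mention, which is why REL defaults to NER-based detection instead.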

2.2. Candidate Selection

For each text span detected as a mention, we select up to K (=7) candidate entities (following (Ganea and Hofmann, 2017)). The first K_p (=4) candidate entities are selected from the top-ranked entities based on the mention-entity prior p(e|m), for a given entity e and a mention m. To compute this prior, we sum up hyperlink counts from Wikipedia and from the CrossWikis corpus (Spitkovsky and Chang, 2012) to estimate the probability p_1(e|m). A uniform probability p_2(e|m) is also extracted from the YAGO dictionary (Hoffart et al., 2011). These two probabilities are combined into the final prior p(e|m) (Ganea and Hofmann, 2017).

The other K_c (=3) candidate entities are chosen based on their similarity to the context of the mention. This similarity score is obtained by

    \mathbf{e}^{\top} \sum_{w \in c} \mathbf{w},

where c is the n-word context surrounding mention m, and \mathbf{w} and \mathbf{e} are word and entity embedding vectors, respectively. This score is computed for the N (=30) entities with the highest prior, and the top K_c entities are added to the list of candidate entities (Ganea and Hofmann, 2017).

In REL, we use Wikipedia2Vec word and entity embeddings (Yamada et al., 2016) to estimate the similarity between an entity and a mention's local context. Wikipedia2Vec jointly learns word and entity embeddings from Wikipedia text and link structure, and is available as an open source library (Yamada et al., 2018). The hyper-parameters K, K_p, K_c, n, and N are set based on the recommended values in (Le and Titov, 2018; Ganea and Hofmann, 2017).
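The two-stage selection above can be sketched as follows. This is an illustrative NumPy version under assumed data structures (a prior dictionary and embedding lookup), not REL's code; the default values mirror the hyper-parameters quoted in the text:

```python
import numpy as np

def select_candidates(prior, ent_emb, ctx_word_emb, k_prior=4, k_ctx=3, top_n=30):
    """Two-stage candidate selection: keep the k_prior entities with the
    highest prior p(e|m); then, among the top_n entities by prior, add the
    k_ctx entities whose embeddings score highest against the summed
    context word embeddings (e^T sum_w w).
    prior: dict entity -> p(e|m); ent_emb: dict entity -> vector;
    ctx_word_emb: (num_words, dim) array of context word vectors."""
    by_prior = sorted(prior, key=prior.get, reverse=True)
    chosen = by_prior[:k_prior]
    ctx_vec = ctx_word_emb.sum(axis=0)                 # sum of context word vectors
    pool = [e for e in by_prior[:top_n] if e not in chosen]
    pool.sort(key=lambda e: float(ent_emb[e] @ ctx_vec), reverse=True)
    return chosen + pool[:k_ctx]

rng = np.random.default_rng(0)
ents = [f"e{i}" for i in range(10)]
prior = {e: (10 - i) / 10 for i, e in enumerate(ents)}  # e0 has the highest prior
emb = {e: rng.normal(size=8) for e in ents}
ctx = rng.normal(size=(5, 8))                           # a 5-word context window
cands = select_candidates(prior, emb, ctx)
print(cands)   # 4 entities by prior followed by 3 by context similarity
```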

2.3. Entity Disambiguation

In the entity disambiguation step, we link mentions to their corresponding entities in the knowledge graph (here: Wikipedia). Entity disambiguation in REL is based on the Ment-norm method proposed by Le and Titov (2018). Given an input document D with mentions m_1, ..., m_n, the entity linking decisions are made by combining local compatibility (which includes prior importance and contextual similarity) and coherence with the other entity linking decisions in the document:

    E^* = \arg\max_{E \in C_1 \times \dots \times C_n} \sum_{i=1}^{n} \Psi(e_i, c_i) + \sum_{i \neq j} \Phi(e_i, e_j, D),    (1)

where C_i denotes the set of candidate entities for mention m_i and E = (e_1, \dots, e_n). The compatibility score between an entity e_i and its local context c_i is computed by the function \Psi as defined in (Ganea and Hofmann, 2017), and the coherence between all entity linking decisions is captured by the function \Phi. Le and Titov (2018) compute the function \Phi by incorporating relations between mentions of a document. Assuming K latent relations, \Phi is calculated as:

    \Phi(e_i, e_j, D) = \sum_{k=1}^{K} \alpha_{ijk} \, \mathbf{e}_i^{\top} \mathbf{R}_k \, \mathbf{e}_j,    (2)

where \mathbf{e}_i and \mathbf{e}_j are the embeddings of entities e_i and e_j (using the same embeddings as in the candidate selection step), \mathbf{R}_k is a diagonal matrix, and \alpha_{ijk} is a normalized score defined as:

    \alpha_{ijk} = \frac{1}{Z_{ijk}} \exp \left\{ \frac{f^{\top}(m_i, c_i) \, \mathbf{D}_k \, f(m_j, c_j)}{\sqrt{d}} \right\},    (3)

where \mathbf{D}_k is a diagonal matrix, and the function f is a single-layer neural network that maps a mention and its context to a d-dimensional vector. Z_{ijk} is a normalization factor over j and is computed as:

    Z_{ijk} = \sum_{j' \neq i} \exp \left\{ \frac{f^{\top}(m_i, c_i) \, \mathbf{D}_k \, f(m_{j'}, c_{j'})}{\sqrt{d}} \right\}.    (4)

The optimization of Eq. (1) is performed using max-product loopy belief propagation (LBP), and the final score for an entity of a mention is obtained by a two-layer neural network that combines \Psi with the max-marginal probability of an entity for a given document. The training of the model, referred to as the ED model henceforth, is performed using a max-margin loss. To estimate posterior probabilities of the linked entities, we fit a logistic function over the final scores obtained by the neural model (Platt, 2000).
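The latent-relation coherence term can be sketched numerically. The following is an illustrative NumPy rendering of the pairwise scores under the assumption that the mention-context vectors f(m_i, c_i) are given; it is not the trained model, just the score computation with ment-norm-style normalization of alpha over the partner mentions j:

```python
import numpy as np

def pairwise_coherence(E, F, R, D):
    """E: (n, d) candidate entity embeddings, one per mention;
    F: (n, d) mention-context vectors f(m_i, c_i);
    R, D: (K, d) diagonals of the per-relation matrices R_k and D_k.
    Returns Phi, an (n, n) matrix with
    Phi[i, j] = sum_k alpha_ijk * e_i^T R_k e_j,
    where alpha_ijk is normalized over j != i (ment-norm)."""
    n, d = E.shape
    Phi = np.zeros((n, n))
    for k in range(R.shape[0]):
        scores = (F * D[k]) @ F.T / np.sqrt(d)      # f_i^T D_k f_j / sqrt(d)
        np.fill_diagonal(scores, -np.inf)           # exclude j = i from normalization
        alpha = np.exp(scores - scores.max(axis=1, keepdims=True))
        alpha /= alpha.sum(axis=1, keepdims=True)   # rows sum to 1 over j != i
        Phi += alpha * ((E * R[k]) @ E.T)           # alpha_ijk * e_i^T R_k e_j
    return Phi

rng = np.random.default_rng(1)
n, d, K = 4, 6, 3
Phi = pairwise_coherence(rng.normal(size=(n, d)), rng.normal(size=(n, d)),
                         rng.normal(size=(K, d)), rng.normal(size=(K, d)))
print(Phi.shape)
```

Because R_k and D_k are diagonal, each relation only rescales embedding dimensions, which keeps the number of parameters per latent relation linear in d.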

3. Implementation and Usage

Next, we describe the implementation details and usage of REL.

3.1. Implementation Details

Memory and GPU usage. One of the design requirements of REL is being lightweight, such that it can be deployed on an average machine. To minimize memory requirements, we store Wikipedia2Vec entity and word embeddings, GloVe embeddings, and an index of pre-computed values (i.e., a surface form dictionary) in a SQLite3 database. Using SQLite, we are able to reduce the memory usage of our API to 1.8GB if the user chooses not to preload embeddings. REL also does not require a GPU during inference. The neural model used for entity disambiguation is a feed-forward network and does not require heavy CPU/GPU usage. Training of Wikipedia2Vec embeddings, however, requires high memory and is done more efficiently using a GPU.
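The idea of keeping embeddings on disk rather than in RAM can be sketched with Python's built-in sqlite3 module; the schema and key naming below are illustrative, not REL's actual layout:

```python
import sqlite3
import numpy as np

# Embeddings are stored as raw float32 blobs keyed by token, so only the
# vectors actually needed for a document are ever read into memory.
con = sqlite3.connect(":memory:")   # a file path would be used in practice
con.execute("CREATE TABLE embeddings (token TEXT PRIMARY KEY, vec BLOB)")

def store(token: str, vec: np.ndarray) -> None:
    con.execute("INSERT OR REPLACE INTO embeddings VALUES (?, ?)",
                (token, vec.astype(np.float32).tobytes()))

def load(token: str) -> np.ndarray:
    row = con.execute("SELECT vec FROM embeddings WHERE token = ?",
                      (token,)).fetchone()
    return np.frombuffer(row[0], dtype=np.float32)

store("ENTITY/Belgrade", np.arange(4, dtype=np.float32))
print(load("ENTITY/Belgrade"))   # [0. 1. 2. 3.]
```

The trade-off is a per-lookup disk read instead of a large resident embedding matrix, which is what keeps the API's footprint small when embeddings are not preloaded.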

REL components. REL has a modular architecture, with separate components for mention detection, entity disambiguation, and the generation of the index. The mention detection component is based on the Flair package and can easily be replaced by another mention detection approach. The disambiguation component is implemented in PyTorch and is based on the source code of (Le and Titov, 2018). The generation of the index is based on the source code of (Ganea and Hofmann, 2017) and involves parsing Wikipedia, the CrossWikis corpus, and YAGO. Any of these may be either removed completely or replaced by different corpora, with the resulting index used in the package instead.

ED Training. For the entity disambiguation method, we use the AIDA-train dataset for training and AIDA-A for validation. We use the Adam optimizer and reduce the learning rate once the F1-score on the validation set reaches a fixed threshold (following (Le and Titov, 2018)).

Embeddings. The entity and word embeddings used for selecting candidate entities are trained on a Wikipedia 2019-07 dump using the Wikipedia2Vec package. Following (Gerritse et al., 2020), we set the min-entity-count parameter to zero and used the Wikipedia link graph during training. For the entity disambiguation model, we use GloVe embeddings (Pennington et al., 2014), as suggested in (Le and Titov, 2018).

 {"text": "Belgrade 1996-08-30 Result in an international basketball tournament on Friday: Red Star ( Yugoslavia ) beat Dinamo ( Russia) 92-90 ( halftime 47-47 )."}
[0, 8, "Belgrade", "Belgrade", 0.91, 0.98, "LOC"],
[80, 8, "Red Star", "KK_Crvena_zvezda", 0.36, 0.99, "ORG"],
[91, 10, "Yugoslavia", "Yugoslavia", 0.8, 0.99, "LOC"],
[109, 6, "Dinamo", "FC_Dinamo_Bucuresti", 0.7, 0.99, "ORG"],
[118, 6, "Russia", "Russia", 0.85, 0.99, "LOC"]
Figure 1. Example API input (top) and output (bottom) for entity linking.

3.2. Usage

REL can be used as a Python package deployed on a local machine, or as a service, via a restful API.

To use REL as a package, our GitHub repository contains step-by-step tutorials on how to perform end-to-end entity linking, and on how to (re-)train the ED model. We provide scripts and instructions for deploying REL using a new Wikipedia dump; this helps REL users to keep up-to-date with emerging entities in Wikipedia, and enables researchers to deploy REL for any specific Wikipedia version that is required for a downstream task.

The API is publicly available. Given an input text, depicted in Fig. 1 (Top), the API returns a list of mentions, each with (i) the start position and length of the mention, (ii) the mention itself, (iii) the linked entity, (iv) the confidence score of ED, and (v) the confidence score and type of entity from the mention detection step (if available); see Fig. 1 (Bottom). Alternatively, a user can use the API for entity disambiguation only, by submitting an input text and a list of spans (specified with start position and length).
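Building the request payload and unpacking a response row can be sketched as follows. The field order follows Fig. 1; the helper names and the "spans" key are our own illustrative choices, not a definitive client:

```python
import json

def make_payload(text, spans=None):
    """Build the JSON request body: text only for end-to-end EL,
    or text plus (start, length) spans for ED-only mode."""
    payload = {"text": text}
    if spans is not None:
        payload["spans"] = spans
    return json.dumps(payload)

def parse_row(row):
    """Name the positional fields of one output row (order as in Fig. 1)."""
    keys = ["start", "length", "mention", "entity",
            "ed_confidence", "md_confidence", "tag"]
    return dict(zip(keys, row))

sample = [0, 8, "Belgrade", "Belgrade", 0.91, 0.98, "LOC"]
print(parse_row(sample)["entity"])   # Belgrade
```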

4. Evaluation









System              Metric     Scores per benchmark dataset
DBpedia Spotlight   Macro F1   52.0  42.4  42.0  41.4  21.5  26.7  33.7  29.4
                    Micro F1   57.8  40.6  44.4  43.1  24.8  27.2  32.2  34.9
WAT                 Macro F1   70.8  62.6  53.2  51.8  45.0  45.3  44.4  37.3
                    Micro F1   73.0  64.5  56.4  53.9  49.2  42.3  38.0  49.6
SOTA NLP            Macro F1   82.6  73.0  56.6  47.8  45.4  43.8  43.2  26.2
                    Micro F1   82.4  72.4  61.9  52.7  50.3  38.2  34.1  35.2
REL (2014)          Macro F1   81.3  73.2  61.5  57.5  46.8  35.9  38.1  60.1
                    Micro F1   83.3  74.4  64.8  58.8  49.7  34.3  41.2  61.6
REL (2019)          Macro F1   78.6  71.1  61.8  57.4  45.7  36.2  38.0  50.1
                    Micro F1   80.5  72.4  63.1  58.3  49.9  35.0  41.1  50.7
Table 1. EL strong matching results on the GERBIL platform; each column corresponds to one benchmark dataset.









System              Metric     Scores per benchmark dataset
DBpedia Spotlight   Macro F1   53.7  43.6  30.4  43.0  41.8  42.6  50.3  48.7
                    Micro F1   56.1  42.1  35.8  43.1  43.4  34.6  43.3  52.3
WAT                 Macro F1   79.8  79.7  62.2   0.0  59.2  62.8  70.4  52.4
                    Micro F1   80.5  78.8  64.9   0.0  63.1  63.9  69.5  62.2
SOTA NLP            Macro F1   83.8  88.5  73.2  76.7  63.4  66.6  65.3  52.4
                    Micro F1   83.0  86.2  74.0  78.1  67.3  68.6  65.4  60.8
REL (2014)          Macro F1   85.5  89.6  65.5  72.0  59.8  61.0  61.9  61.9
                    Micro F1   86.6  88.5  65.8  72.2  64.9  62.8  62.1  64.6
REL (2019)          Macro F1   82.9  86.3  64.0  67.0  58.2  61.7  62.3  54.4
                    Micro F1   84.0  85.8  64.3  67.3  64.9  64.1  62.0  54.0
Table 2. ED results on the GERBIL platform; each column corresponds to one benchmark dataset.

We compare REL with a state-of-the-art end-to-end entity linking approach (Kolitsas et al., 2018), referred to as SOTA NLP, and two popular, well-established entity linking systems: (i) DBpedia Spotlight (Mendes et al., 2011) and (ii) WAT (Piccinno et al., 2014), the updated version of TAGME (Ferragina and Scaiella, 2010). We report the results for two versions of our system. The first one, denoted as REL (2014), is based on the original implementation of (Le and Titov, 2018) for ED. It uses Wikipedia 2014 as the reference knowledge base and employs entity embeddings provided by (Ganea and Hofmann, 2017) for candidate selection. The second version of our system, denoted as REL (2019), is based on Wikipedia 2019-07 and uses Wikipedia2Vec embeddings; cf. Section 3.

We use the GERBIL platform (Röder et al., 2018) for evaluation, and report on micro and macro InKB F1 scores for both EL and ED. Table 1 shows the strong matching results for EL, where strong refers to the requirement of exactly predicting the gold mention boundaries. We first note that REL outperforms the well-established entity linking toolkits (DBpedia Spotlight and WAT) by a large margin. Comparing with SOTA NLP, we observe that REL (2019) outperforms (or performs on par with) SOTA NLP on half of the datasets. The ED results in Table 2 also show consistent and significant improvements of REL over the two well-established toolkits. SOTA NLP, however, obtains better results than REL on all but three datasets. For both EL and ED, we observe that REL (2014) achieves better results than REL (2019). This can be attributed to the different embeddings used for candidate selection: the recall of candidate entities chosen by their similarity to the context of the mentions is lower in REL (2019) than in REL (2014).

For a reference comparison, we also report the results of the ED method (referred to as MulRel-NEL) as reported in (Le and Titov, 2018); see Table 3. The micro F1 score reported in this table is computed locally and by matching ED results against the original datasets. The results show that REL (2014) and MulRel-NEL scores are almost identical, which attests to the repeatability of (Le and Titov, 2018). Again, we observe a decrease in performance when comparing REL (2019) to REL (2014), just like in Table 2.

Finally, we report on the runtime efficiency of REL in Table 4. Specifically, we measure efficiency on a random sample of 50 documents (with a minimum length of 200 words) taken from AIDA-B. The experiments were run on a laptop with Intel i7 CPU (2.80GHz), 16GB RAM, and an NVIDIA Geforce GTX 1050 (4GB) GPU. The results show that detecting the mentions takes considerably more time than ED, and is done more efficiently using GPU. The ED time, however, is less affected by the GPU usage. This indicates that the overall efficiency of REL can be improved by replacing MD with a more efficient NER approach.







System                            Micro F1 per benchmark dataset
MulRel-NEL (Le and Titov, 2018)   93.1  89.9  88.3  77.5  93.9  78.0
REL (2014)                        92.8  89.7  87.4  77.6  93.5  78.7
REL (2019)                        89.4  85.3  84.1  71.9  90.7  73.1
Table 3. Local ED results; MulRel-NEL scores as reported in (Le and Titov, 2018). Each column corresponds to one benchmark dataset.

               Time MD        Time ED
With GPU       0.44 ± 0.22    0.24 ± 0.08
Without GPU    2.41 ± 1.24    0.18 ± 0.09
Table 4. Efficiency of REL (in seconds per document) on 50 documents from AIDA-B with a minimum of 200 words, i.e., 323 (± 105) words and 42 (± 19) mentions per document.

5. Conclusion

We have introduced the Radboud Entity Linker (REL), an open source toolkit for entity linking. REL builds on state-of-the-art neural components from natural language processing research, and is provided as a Python package and as a web API. Currently, REL is optimized for annotating documents and short texts. In the future, we plan to train REL on a large corpus of annotated queries and make it available for the task of entity linking in queries as well.


  • A. Akbik, D. Blythe, and R. Vollgraf (2018) Contextual string embeddings for sequence labeling. In Proc. of COLING ’18, pp. 1638–1649. Cited by: 1st item, §2.1.
  • K. Balog, H. Ramampiaro, N. Takhirov, and K. Nørvåg (2013) Multi-step classification approaches to cumulative citation recommendation. In Proc. of OAIR ’13, pp. 121–128. Cited by: §1.
  • K. Balog (2018) Entity-oriented search. The Information Retrieval Series, Vol. 39, Springer. Cited by: §1, §2.1, §2.1, §2.
  • R. Bunescu and M. Paşca (2006) Using encyclopedic knowledge for named entity disambiguation. In Proc. of EACL ’06, pp. 9–16. Cited by: §1.
  • M. Cornolti, P. Ferragina, M. Ciaramita, S. Rüd, and H. Schütze (2018) SMAPH: a piggyback approach for entity-linking in web queries. ACM Trans. Inf. Syst. 37 (1). Cited by: §1.
  • S. Cucerzan (2007) Large-scale named entity disambiguation based on Wikipedia data. In Proc. of EMNLP-CoNLL ’07, pp. 708–716. Cited by: §1.
  • P. Ferragina and U. Scaiella (2010) TAGME: On-the-fly annotation of short text fragments (by Wikipedia entities). In Proc. of CIKM ’10, pp. 1625–1628. Cited by: §1, §4.
  • O. Ganea and T. Hofmann (2017) Deep joint entity disambiguation with local neural attention. In Proc. of EMNLP ’17, pp. 2619–2629. Cited by: 1st item, §2.2, §2.2, §2.2, §2.3, §3.1, §4.
  • E. Gerritse, F. Hasibi, and A. P. de Vries (2020) Graph-embedding empowered entity retrieval. In Proc. of ECIR ’20, pp. 97–110. Cited by: §3.1.
  • F. Hasibi, K. Balog, and S. E. Bratsberg (2015) Entity linking in queries: tasks and evaluation. In Proc. of ICTIR ’15, pp. 171–180. Cited by: §2.1.
  • F. Hasibi, K. Balog, and S. E. Bratsberg (2016) Exploiting entity linking in queries for entity retrieval. In Proc. of ICTIR ’16, pp. 209–218. Cited by: §1.
  • F. Hasibi, K. Balog, D. Garigliotti, and S. Zhang (2017) Nordlys: a toolkit for entity-oriented and semantic search. In Proc. of SIGIR ’17, pp. 1289–1292. Cited by: §1.
  • J. Hoffart, M. A. Yosef, I. Bordino, H. Fürstenau, M. Pinkal, M. Spaniol, B. Taneva, S. Thater, and G. Weikum (2011) Robust disambiguation of named entities in text. In Proc. of EMNLP ’11, pp. 782–792. Cited by: §2.2.
  • N. Kolitsas, O. Ganea, and T. Hofmann (2018) End-to-end neural entity linking. In Proc. of CoNLL ’18, pp. 519–529. Cited by: 1st item, §1, §4.
  • P. Le and I. Titov (2018) Improving entity linking by modeling latent relations between mentions. In Proc. of ACL ’18, pp. 1595–1604. Cited by: 1st item, §2.2, §2.3, §3.1, §3.1, §3.1, Table 3, §4, §4.
  • P. N. Mendes, M. Jakob, A. García-Silva, and C. Bizer (2011) DBpedia Spotlight: shedding light on the web of documents. In Proc. of I-Semantics ’11, pp. 1–8. Cited by: §1, §4.
  • R. Mihalcea and A. Csomai (2007) Wikify! - Linking documents to encyclopedic knowledge. In Proc. of CIKM ’07, pp. 233–242. Cited by: §1.
  • D. Milne and I. H. Witten (2008) Learning to link with Wikipedia. In Proc. of CIKM ’08, pp. 509–518. Cited by: §1.
  • A. Pappu, R. Blanco, Y. Mehdad, A. Stent, and K. Thadani (2017) Lightweight multilingual entity extraction and linking. In Proc. of WSDM ’17, pp. 365–374. Cited by: §1, §1.
  • J. Pennington, R. Socher, and C. Manning (2014) Glove: global vectors for word representation. In Proc. of EMNLP ’14, pp. 1532–1543. Cited by: §3.1.
  • F. Piccinno and P. Ferragina (2014) From TagME to WAT: a new entity annotator. In Proc. of ERD ’14, pp. 55–62. Cited by: §1, §1, §4.
  • J. Platt (2000) Probabilities for SV machines. In Advances in Large-Margin Classifiers, pp. 61–73. Cited by: §2.3.
  • R. Reinanda, E. Meij, and M. de Rijke (2015) Mining, ranking and recommending entity aspects. In Proc. of SIGIR ’15, pp. 263–272. Cited by: §1.
  • M. Röder, R. Usbeck, and A. Ngonga Ngomo (2018) GERBIL–benchmarking named entity recognition and linking consistently. Semantic Web 9 (5), pp. 605–625. Cited by: §4.
  • V. I. Spitkovsky and A. X. Chang (2012) A cross-lingual dictionary for English Wikipedia concepts. In Proc. of LREC’12, pp. 3168–3175. Cited by: §2.2.
  • C. Xiong, J. Callan, and T. Liu (2017) Word-entity duet representations for document ranking. In Proc. of SIGIR ’17, pp. 763–772. Cited by: §1.
  • I. Yamada, A. Asai, H. Shindo, H. Takeda, and Y. Takefuji (2018) Wikipedia2Vec: an optimized tool for learning embeddings of words and entities from Wikipedia. arXiv preprint 1812.06280. Cited by: §2.2.
  • I. Yamada, H. Shindo, H. Takeda, and Y. Takefuji (2016) Joint learning of the embedding of words and entities for named entity disambiguation. In Proc of CoNLL ’16, pp. 250–259. Cited by: §2.2.