Mind the Gap: Cross-Lingual Information Retrieval with Hierarchical Knowledge Enhancement

12/27/2021
by Fuwei Zhang, et al.

Cross-Lingual Information Retrieval (CLIR) aims to rank documents written in a language different from that of the user's query. The intrinsic gap between languages is a central challenge for CLIR. In this paper, we introduce a multilingual knowledge graph (KG) to the CLIR task, because it provides rich information about entities in multiple languages. We regard it as a "silver bullet" that simultaneously performs explicit alignment between queries and documents and broadens the representations of queries. We propose a model named CLIR with hierarchical knowledge enhancement (HIKE) for this task. The proposed model encodes the textual information in queries, documents, and the KG with multilingual BERT, and incorporates the KG information into the query-document matching process through a hierarchical information fusion mechanism. Specifically, HIKE first integrates the entities and their KG neighborhoods into query representations with a knowledge-level fusion, then combines the knowledge from both the source and target languages with a language-level fusion to further mitigate the linguistic gap. Experimental results demonstrate that HIKE achieves substantial improvements over state-of-the-art competitors.
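The two-stage fusion described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that the query, entity, and neighbor embeddings are already available as plain lists of floats (standing in for multilingual BERT outputs), and it uses simple dot-product attention for both fusion levels. The function names (`attend`, `knowledge_level_fusion`, `language_level_fusion`, `hike_style_query`) and the final concatenation step are hypothetical choices made only for illustration.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, candidates):
    """Attention-weighted average of candidates, scored against the query."""
    weights = softmax([dot(query, c) for c in candidates])
    return [
        sum(w * c[i] for w, c in zip(weights, candidates))
        for i in range(len(query))
    ]


def knowledge_level_fusion(query_vec, entity_vec, neighbor_vecs):
    """Assumed form: fuse an entity with its KG neighborhood, guided by the query."""
    return attend(query_vec, [entity_vec] + neighbor_vecs)


def language_level_fusion(query_vec, source_knowledge, target_knowledge):
    """Assumed form: combine source- and target-language knowledge vectors."""
    return attend(query_vec, [source_knowledge, target_knowledge])


def hike_style_query(query_vec, src_entity, src_neighbors, tgt_entity, tgt_neighbors):
    """Apply knowledge-level fusion per language, then language-level fusion."""
    src = knowledge_level_fusion(query_vec, src_entity, src_neighbors)
    tgt = knowledge_level_fusion(query_vec, tgt_entity, tgt_neighbors)
    fused = language_level_fusion(query_vec, src, tgt)
    # Hypothetical choice: concatenate the original query with the fused knowledge.
    return query_vec + fused


if __name__ == "__main__":
    query = [1.0, 0.0]
    enhanced = hike_style_query(
        query,
        src_entity=[0.5, 0.5], src_neighbors=[[1.0, 0.0]],
        tgt_entity=[0.2, 0.8], tgt_neighbors=[[0.0, 1.0]],
    )
    print(enhanced)
```

The hierarchy is the point of the sketch: neighborhood information is aggregated within each language first, and only the resulting per-language summaries are merged across languages. A real system would replace the toy vectors with encoder outputs and learn the attention parameters.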


research
06/08/2019

Improving Low-Resource Cross-lingual Document Retrieval by Reranking with Deep Bilingual Representations

In this paper, we propose to boost low-resource cross-lingual document r...
research
09/07/2021

Mixed Attention Transformer for Leveraging Word-Level Knowledge to Neural Cross-Lingual Information Retrieval

Pretrained contextualized representations offer great success for many d...
research
04/24/2020

Cross-lingual Information Retrieval with BERT

Multiple neural language models have been developed recently, e.g., BERT...
research
11/03/2021

Leveraging Advantages of Interactive and Non-Interactive Models for Vector-Based Cross-Lingual Information Retrieval

Interactive and non-interactive model are the two de-facto standard fram...
research
10/15/2019

Aligning Cross-Lingual Entities with Multi-Aspect Information

Multilingual knowledge graphs (KGs), such as YAGO and DBpedia, represent...
research
09/03/2022

Multilingual ColBERT-X

ColBERT-X is a dense retrieval model for Cross Language Information Retr...
research
07/19/2022

QuoteKG: A Multilingual Knowledge Graph of Quotes

Quotes of public figures can mark turning points in history. A quote can...
