Statistical and Neural Methods for Cross-lingual Entity Label Mapping in Knowledge Graphs

06/17/2022
by Gabriel Amaral, et al.

Knowledge bases such as Wikidata amass vast amounts of named entity information, including multilingual labels, which can be extremely useful for multilingual and cross-lingual applications. However, these labels are not guaranteed to be consistent across languages, which greatly compromises their usefulness for fields such as machine translation. In this work, we investigate the application of word and sentence alignment techniques, coupled with a matching algorithm, to align cross-lingual entity labels extracted from Wikidata in 10 languages. Our results indicate that the mapping between Wikidata's main labels can be considerably improved (by up to 20 points in F1-score) by any of the employed methods. We show that methods relying on sentence embeddings outperform all others, even across different scripts. We believe that applying such techniques to measure the similarity of label pairs, coupled with a knowledge base rich in high-quality entity labels, would be an excellent asset to machine translation.
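As a rough illustration of the pipeline the abstract describes (score label pairs by similarity, apply a matching algorithm, evaluate with F1), here is a minimal sketch. It is not the authors' implementation. Character-trigram cosine similarity stands in for the word and sentence alignment models, greedy one-to-one matching is an assumed choice of matching algorithm, and the function names and the 0.3 threshold are hypothetical. Unlike sentence embeddings, character trigrams cannot relate labels written in different scripts.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n=3):
    """Bag of character n-grams (stand-in for a learned embedding)."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def match_labels(source, target, threshold=0.3):
    """Greedy one-to-one matching: take the most similar pairs first.

    Returns a sorted list of (source_index, target_index) pairs whose
    similarity is at least `threshold` (an assumed cut-off).
    """
    src_vecs = [char_ngrams(s) for s in source]
    tgt_vecs = [char_ngrams(t) for t in target]
    scored = sorted(
        ((cosine(sv, tv), i, j)
         for i, sv in enumerate(src_vecs)
         for j, tv in enumerate(tgt_vecs)),
        key=lambda item: (-item[0], item[1], item[2]),
    )
    used_src, used_tgt, pairs = set(), set(), []
    for score, i, j in scored:
        if score < threshold:
            break  # remaining candidates are even less similar
        if i not in used_src and j not in used_tgt:
            pairs.append((i, j))
            used_src.add(i)
            used_tgt.add(j)
    return sorted(pairs)


def f1_score(predicted, gold):
    """F1 over sets of (source_index, target_index) pairs."""
    predicted, gold = set(predicted), set(gold)
    true_pos = len(predicted & gold)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(predicted)
    recall = true_pos / len(gold)
    return 2 * precision * recall / (precision + recall)
```

To approximate the setting in the abstract, `char_ngrams` and `cosine` would be replaced by a multilingual sentence encoder and its similarity function; the matching and scoring steps would stay the same.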
