Cross-Lingual Fine-Grained Entity Typing

10/15/2021
by Nila Selvaraj, et al.

The growth of cross-lingual pre-trained models has enabled NLP tools to rapidly generalize to new languages. While these models have been applied to tasks involving entities, their ability to explicitly predict typological features of these entities across languages has not been established. In this paper, we present a unified cross-lingual fine-grained entity typing model capable of handling over 100 languages and analyze this model's ability to generalize to languages and entities unseen during training. We train this model on cross-lingual training data collected from Wikipedia hyperlinks in multiple languages (training languages). During inference, our model takes an entity mention and context in a particular language (test language, possibly not in the training languages) and predicts fine-grained types for that entity. Generalizing to new languages and unseen entities are the fundamental challenges of this entity typing setup, so we focus our evaluation on these settings and compare against simple yet powerful string match baselines. Experimental results show that our approach outperforms the baselines on unseen languages such as Japanese, Tamil, Arabic, Serbian, and Persian. In addition, our approach substantially improves performance on unseen entities (even in unseen languages) over the baselines, and human evaluation shows a strong ability to predict relevant types in these settings.
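To make the comparison point concrete, here is a minimal sketch of what a string match baseline for entity typing might look like. The abstract does not describe how the baselines are built, so everything below is an assumption for illustration: the function names `build_mention_index` and `predict_types`, the case-folding normalization, the `min_count` threshold, and the toy training pairs are all hypothetical.

```python
from collections import Counter, defaultdict


def build_mention_index(training_examples):
    """Map each normalized mention string to a Counter of the types seen with it.

    `training_examples` is an iterable of (mention, types) pairs. This data
    layout is an assumption for the sketch, not the paper's format.
    """
    index = defaultdict(Counter)
    for mention, types in training_examples:
        index[mention.casefold().strip()].update(types)
    return index


def predict_types(index, mention, min_count=1):
    """Return the types seen at least `min_count` times with this exact string.

    A mention whose string never appeared in training gets an empty set,
    which is the failure mode a string match approach has on unseen
    entities and unseen scripts.
    """
    counts = index.get(mention.casefold().strip())
    if counts is None:
        return set()
    return {type_name for type_name, count in counts.items() if count >= min_count}


# Hypothetical toy data standing in for hyperlink-derived training pairs.
training_examples = [
    ("Paris", {"location", "city"}),
    ("Paris", {"location", "city"}),
    ("Marie Curie", {"person", "scientist"}),
]
index = build_mention_index(training_examples)

seen = predict_types(index, "paris")    # string seen in training
unseen = predict_types(index, "パリ")    # same entity, unseen script
```

In this sketch the lookup succeeds only when the surface string matches a training mention, so a mention written in a script absent from the training data returns no types. A model that uses the surrounding context does not depend on such an exact match.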

research
04/17/2021

XLEnt: Mining a Large Cross-lingual Entity Dataset with Lexical-Semantic-Phonetic Word Alignment

Cross-lingual named-entity lexicon are an important resource to multilin...
research
12/05/2017

Neural Cross-Lingual Entity Linking

A major challenge in Entity Linking (EL) is making effective use of cont...
research
10/15/2021

A Multilingual Bag-of-Entities Model for Zero-Shot Cross-Lingual Text Classification

We present a multilingual bag-of-entities model that effectively boosts ...
research
05/22/2023

How do languages influence each other? Studying cross-lingual data sharing during LLM fine-tuning

Multilingual large language models (MLLMs) are jointly trained on data f...
research
10/22/2022

EntityCS: Improving Zero-Shot Cross-lingual Transfer with Entity-Centric Code Switching

Accurate alignment between languages is fundamental for improving cross-...
research
05/19/2023

From Alignment to Entailment: A Unified Textual Entailment Framework for Entity Alignment

Entity Alignment (EA) aims to find the equivalent entities between two K...
research
10/23/2020

Evaluating Language Tools for Fifteen EU-official Under-resourced Languages

This article presents the results of the evaluation campaign of language...
