Cross-lingual Extended Named Entity Classification of Wikipedia Articles

10/07/2020
by The Viet Bui, et al.

The FPT.AI team participated in the SHINRA2020-ML subtask of the NTCIR-15 SHINRA task. This paper describes our method for solving the problem and discusses the official results. Our method focuses on learning cross-lingual representations, at both the word level and the document level, for page classification. We propose a three-stage approach: multilingual model pre-training, monolingual model fine-tuning, and cross-lingual voting. Our system achieves the best scores for 25 of the 30 languages, and its accuracy gaps to the best-performing systems on the other five languages are relatively small.
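The final stage, cross-lingual voting, can be illustrated with a minimal sketch: per-language models each predict a label for the same entity, and the label with the most votes wins. This is a hypothetical illustration of majority voting, not the paper's exact aggregation scheme; the function name and example predictions are assumptions.

```python
from collections import Counter

def cross_lingual_vote(predictions):
    """Majority vote over per-language label predictions for one page.

    predictions: mapping of language code -> predicted label
    (hypothetical helper; the paper's own voting rule may differ).
    """
    votes = Counter(predictions.values())
    label, _ = votes.most_common(1)[0]
    return label

# Hypothetical example: three monolingual models classify the same entity.
preds = {"en": "Person", "ja": "Person", "vi": "Organization"}
print(cross_lingual_vote(preds))  # -> Person
```

In practice a tie-breaking rule (e.g. preferring the highest-confidence model) would be needed when languages disagree evenly.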


