Design Challenges in Named Entity Transliteration

08/07/2018
by Yuval Merhav, et al.

We analyze some of the fundamental design challenges that impact the development of a multilingual, state-of-the-art named entity transliteration system, including curating bilingual named entity datasets and evaluating multiple transliteration methods. We empirically compare a traditional weighted finite state transducer (WFST) approach against two neural approaches: the encoder-decoder recurrent neural network method and the recent, non-sequential Transformer method. To improve the availability of bilingual named entity transliteration datasets, we release personal name bilingual dictionaries mined from Wikidata for English to Russian, Hebrew, Arabic, and Japanese Katakana. Our code and dictionaries are publicly available.
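As a rough illustration of what comparing transliteration systems on a bilingual name dictionary can look like, the sketch below scores a system on (source, reference) pairs. It is not the authors' evaluation code: the two metrics (exact-match word accuracy and character error rate), the function names `edit_distance` and `evaluate`, and the toy lookup "system" are all assumptions made for this example, since the abstract does not say which metrics were used.

```python
from typing import Callable, Dict, List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (unit-cost edits)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def evaluate(system: Callable[[str], str],
             pairs: List[Tuple[str, str]]) -> Dict[str, float]:
    """Score a transliteration function on (source, reference) pairs.

    Assumed metrics (not taken from the paper):
      word_accuracy: fraction of outputs that match the reference exactly
      cer: total edit distance divided by total reference length
    """
    exact = 0
    errors = 0
    ref_chars = 0
    for source, reference in pairs:
        hypothesis = system(source)
        exact += int(hypothesis == reference)
        errors += edit_distance(hypothesis, reference)
        ref_chars += len(reference)
    return {"word_accuracy": exact / len(pairs), "cer": errors / ref_chars}


# Hypothetical stand-in for a trained model (WFST, RNN, or Transformer):
# a lookup table that gets one of two names wrong by a single character.
toy_outputs = {"anna": "анна", "ivan": "иваи"}
toy_pairs = [("anna", "анна"), ("ivan", "иван")]

scores = evaluate(lambda name: toy_outputs[name], toy_pairs)
print(scores)
```

Any of the three model families could be passed in as `system`, provided it is wrapped as a function from a source string to a target string, which keeps the scoring code independent of the model being compared.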


