An Open-Source Dataset and A Multi-Task Model for Malay Named Entity Recognition

09/03/2021
by   Yingwen Fu, et al.
Named entity recognition (NER) is a fundamental task in natural language processing (NLP). However, most state-of-the-art research targets high-resource languages such as English and has not been widely applied to low-resource languages. For Malay, NER resources are limited. In this work, we propose a dataset construction framework, based on labeled datasets of homologous languages and iterative optimization, to build a Malay NER dataset (MYNER) comprising 28,991 sentences (over 384 thousand tokens). Additionally, to better integrate boundary information for NER, we propose a multi-task (MT) model with a bidirectional revision (Bi-revision) mechanism for the Malay NER task. Specifically, an auxiliary task, boundary detection, is introduced to improve NER training in both explicit and implicit ways. Furthermore, a gated ignoring mechanism is proposed to conduct conditional label transfer and alleviate error propagation from the auxiliary task. Experimental results demonstrate that our model achieves results comparable to the baselines on MYNER. The dataset and the model in this paper will be publicly released as a benchmark.
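To make the auxiliary boundary detection task concrete, here is a minimal sketch of how boundary labels could be derived from NER labels. It assumes a BIO tagging scheme and a type-free boundary label set (`B`, `I`, `O`); the abstract does not state the scheme used for MYNER, so the function name `ner_to_boundary`, the tag set, and the example sentence are all illustrative assumptions and not the authors' implementation.

```python
from typing import List


def ner_to_boundary(tags: List[str]) -> List[str]:
    """Drop entity types from BIO tags, keeping only span-boundary markers.

    Assumption: tags are either "O" or of the form "B-TYPE" / "I-TYPE".
    """
    boundary = []
    for tag in tags:
        if tag == "O":
            boundary.append("O")
        else:
            # "B-LOC" -> "B", "I-PER" -> "I"
            prefix, _, _entity_type = tag.partition("-")
            boundary.append(prefix)
    return boundary


# Hypothetical example sentence with hand-assigned tags.
tokens = ["Perdana", "Menteri", "Malaysia", "melawat", "Kuala", "Lumpur"]
ner_tags = ["O", "O", "B-LOC", "O", "B-LOC", "I-LOC"]
boundary_tags = ner_to_boundary(ner_tags)
```

In a multi-task setup of this kind, both label sequences would supervise a shared encoder, with the boundary labels acting as the auxiliary signal. The Bi-revision and gated ignoring mechanisms described in the abstract are not sketched here.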

research
06/04/2019

Back Attention Knowledge Transfer for Low-resource Named Entity Recognition

In recent years, great success has been achieved in the field of natural...
research
07/07/2022

AsNER – Annotated Dataset and Baseline for Assamese Named Entity recognition

We present the AsNER, a named entity annotation dataset for low resource...
research
12/19/2022

MANER: Mask Augmented Named Entity Recognition for Extreme Low-Resource Languages

This paper investigates the problem of Named Entity Recognition (NER) fo...
research
03/08/2022

InstructionNER: A Multi-Task Instruction-Based Generative Framework for Few-shot NER

Recently, prompt-based methods have achieved significant performance in ...
research
06/10/2023

Enhancing Low Resource NER Using Assisting Language And Transfer Learning

Named Entity Recognition (NER) is a fundamental task in NLP that is used...
research
01/24/2022

BTPK-based learning: An Interpretable Method for Named Entity Recognition

Named entity recognition (NER) is an essential task in natural language ...
research
01/12/2023

A Dataset of Kurdish (Sorani) Named Entities – An Amendment to Kurdish-BLARK Named Entities

Named Entity Recognition (NER) is one of the essential applications of N...
