ANEC: An Amharic Named Entity Corpus and Transformer Based Recognizer

07/02/2022
by Ebrahim Chekol Jibril, et al.

Named entity recognition (NER) is an information extraction task that serves as a preprocessing step for other natural language processing tasks, such as machine translation, information retrieval, and question answering. It identifies proper names as well as temporal and numeric expressions in open-domain text. For Semitic languages such as Arabic, Amharic, and Hebrew, the task is more challenging because these languages are heavily inflected. In this paper, we present an Amharic named entity recognition system based on bidirectional long short-term memory with a conditional random fields layer. We annotate a new Amharic named entity recognition dataset (8,070 sentences comprising 182,691 tokens) and apply the Synthetic Minority Over-sampling Technique (SMOTE) to the dataset to mitigate the imbalanced classification problem. Our system achieves an F_1 score of 93%, a state-of-the-art result for Amharic named entity recognition.
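
The oversampling step mentioned in the abstract can be illustrated with a minimal sketch of the core SMOTE idea: a synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. The abstract does not say how tokens are turned into feature vectors or which SMOTE implementation is used, so the function name `smote`, its parameters, and the toy vectors below are assumptions made purely for illustration.

```python
import math
import random


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the base point (excluding itself).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy, made-up feature vectors for a rare entity class (not taken from the paper's data).
rare_class = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote(rare_class, n_new=6)
```

Because every synthetic point lies between two existing minority samples, the new points stay inside the region the rare class already occupies instead of duplicating samples exactly.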


research
05/29/2017

The Importance of Automatic Syntactic Features in Vietnamese Named Entity Recognition

This paper presents a state-of-the-art system for Vietnamese Named Entit...
research
09/28/2019

Named Entity Recognition System for Sindhi Language

Named Entity Recognition (NER) System aims to extract the existing infor...
research
04/06/2023

Using LSTM and GRU With a New Dataset for Named Entity Recognition in the Arabic Language

Named entity recognition (NER) is a natural language processing task (NL...
research
10/18/2016

Vietnamese Named Entity Recognition using Token Regular Expressions and Bidirectional Inference

This paper describes an efficient approach to improve the accuracy of a ...
research
10/12/2021

Investigation on Data Adaptation Techniques for Neural Named Entity Recognition

Data processing is an important step in various natural language process...
research
04/15/2021

UIT-E10dot3 at SemEval-2021 Task 5: Toxic Spans Detection with Named Entity Recognition and Question-Answering Approaches

The increment of toxic comments on online space is causing tremendous ef...
research
08/24/2016

Robust Named Entity Recognition in Idiosyncratic Domains

Named entity recognition often fails in idiosyncratic domains. That caus...
