Neural Modeling for Named Entities and Morphology (NEMO^2)

07/30/2020
by Dan Bareket, et al.

Named Entity Recognition (NER) is a fundamental NLP task, commonly formulated as classification over a sequence of tokens. Morphologically-Rich Languages (MRLs) pose a challenge to this basic formulation, because the boundaries of named entities do not coincide with token boundaries; rather, they respect morphological boundaries. To address NER in MRLs, we need to answer two fundamental modeling questions: (i) What should be the basic units to be identified and labeled: token-based or morpheme-based? and (ii) How can morphological units be encoded and accurately obtained in realistic (non-gold) scenarios? We empirically investigate these questions on a novel NER benchmark that we deliver, with parallel token-level and morpheme-level NER annotations for Modern Hebrew, a morphologically complex language. Our results show that explicitly modeling morphological boundaries consistently leads to improved NER performance, and that a novel hybrid architecture that we propose, in which NER precedes and prunes the morphological decomposition (MD) space, greatly outperforms the standard pipeline approach on both Hebrew NER and Hebrew MD in realistic scenarios.
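To make the token-versus-morpheme distinction concrete, the following is a minimal sketch, not taken from the paper or its benchmark. It assumes a plain BIO labeling scheme and uses a hypothetical, transliterated two-token phrase in which a preposition morpheme ("b") is fused into the first token of an entity. The names `sentence`, `extract_spans`, and `token_level_labels` are illustrative only.

```python
from typing import List, Tuple

# One token = a list of (morpheme, BIO label) pairs.
# Hypothetical, transliterated data; not drawn from the actual benchmark.
Token = List[Tuple[str, str]]

sentence: List[Token] = [
    [("b", "O"), ("ha", "B-ORG"), ("bayit", "I-ORG")],  # preposition fused into the token
    [("ha", "I-ORG"), ("lavan", "I-ORG")],
]


def extract_spans(labels: List[str]) -> List[Tuple[int, int, str]]:
    """Return (start, end_exclusive, type) spans from a BIO label sequence."""
    spans = []
    start, etype = None, None
    for i, lab in enumerate(labels + ["O"]):  # sentinel closes a trailing span
        continues = lab.startswith("I-") and etype == lab[2:]
        if start is not None and not continues:
            spans.append((start, i, etype))
            start, etype = None, None
        if lab.startswith("B-"):
            start, etype = i, lab[2:]
    return spans


def token_level_labels(sent: List[Token]) -> List[str]:
    """Collapse morpheme labels to one label per token.

    Assumption for this sketch: a token takes an entity type if any of
    its morphemes carries one.
    """
    out: List[str] = []
    prev_type = None
    for token in sent:
        types = [lab[2:] for _, lab in token if lab != "O"]
        if not types:
            out.append("O")
            prev_type = None
            continue
        etype = types[0]
        starts_here = any(lab.startswith("B-") for _, lab in token)
        prefix = "B-" if starts_here or prev_type != etype else "I-"
        out.append(prefix + etype)
        prev_type = etype
    return out


morphemes = [m for token in sentence for m, _ in token]
morph_labels = [lab for token in sentence for _, lab in token]

# Morpheme-level view: the entity excludes the preposition "b".
morph_spans = extract_spans(morph_labels)
morph_entity = [morphemes[s:e] for s, e, _ in morph_spans]

# Token-level view: the whole first token, preposition included, is labeled.
tok_labels = token_level_labels(sentence)
```

In this sketch the morpheme-level span covers only `ha bayit ha lavan`, while the token-level labels necessarily cover the whole first token, including the fused preposition. That mismatch is the kind of boundary problem the abstract describes.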


