Neural Named Entity Recognition from Subword Units

08/22/2018
by Abdalghani Abujabal, et al.

Named entity recognition (NER) is a vital task in language technology. Existing neural models for NER rely mostly on dedicated word-level representations, which suffer from two main shortcomings: 1) the vocabulary is large, which increases memory requirements and training time, and 2) they cannot learn morphological representations. We adopt a neural solution based on bidirectional LSTMs and conditional random fields, in which we rely on subword units, namely characters, phonemes, and bytes, to remedy these shortcomings. We conducted experiments on a large dataset covering four languages with up to 5.5M utterances per language. Our experiments show that 1) as training data increases, the performance of models trained solely on subword units approaches that of models with dedicated word-level embeddings (91.35 vs 93.92 F1 for English), while using a much smaller vocabulary (332 vs 74K), 2) subword units enhance models with dedicated word-level embeddings, and 3) combining different subword units improves performance.
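
To make the vocabulary-size argument concrete, the following is a minimal sketch of how a token can be decomposed into two of the subword units named above, characters and bytes. It is an illustration only and not the authors' code: the function names, the padding symbol, and the example utterance are assumptions. Phonemes are left out because they would require a pronunciation lexicon, which the abstract does not describe.

```python
def char_units(token: str) -> list[str]:
    """Split a token into its Unicode characters."""
    return list(token)


def byte_units(token: str) -> list[int]:
    """Split a token into its UTF-8 byte values (each in 0..255)."""
    return list(token.encode("utf-8"))


def build_vocab(sequences):
    """Map each distinct unit to an integer id, reserving 0 for padding."""
    vocab = {"<pad>": 0}  # hypothetical padding symbol
    for seq in sequences:
        for unit in seq:
            vocab.setdefault(unit, len(vocab))
    return vocab


# Hypothetical utterance, chosen to include a non-ASCII character.
utterance = "play München radio".split()
char_vocab = build_vocab(char_units(t) for t in utterance)
byte_vocab = build_vocab(byte_units(t) for t in utterance)

for token in utterance:
    print(token, char_units(token), byte_units(token))
```

A byte vocabulary can never exceed 256 entries plus any special symbols, however many distinct words the corpus contains. A character vocabulary grows with the scripts covered but typically stays far below a word-level vocabulary, which is the trade-off the reported figures (332 vs 74K) reflect.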
