Enhancing deep neural networks with morphological information

11/24/2020
by   Matej Klemen, et al.

Deep learning approaches currently dominate natural language processing because of their ability to extract informative features and patterns from languages. The two most successful neural architectures are LSTMs and transformers, the latter mostly used in the form of large pretrained language models such as BERT. While cross-lingual approaches are on the rise, the vast majority of current natural language processing techniques are designed for and applied to English, and less-resourced languages are lagging behind. In morphologically rich languages, much information is conveyed through changes in morphology, e.g., through different prefixes and suffixes modifying the stems of words. Existing neural approaches do not explicitly use information on word morphology. We analyze the effect of adding morphological features to LSTM and BERT models. We use three tasks available in many less-resourced languages: named entity recognition (NER), dependency parsing (DP), and comment filtering (CF). We construct sensible baselines involving LSTM and BERT models, which we adjust by adding input in the form of part-of-speech (POS) tags and universal features. We compare the obtained models across subsets of eight languages. Our results suggest that adding morphological features has mixed effects, depending on the quality of the features and the task. The features improve the performance of LSTM-based models on the NER and DP tasks, but they do not benefit performance on the CF task. For BERT-based models, the added morphological features improve performance on DP only when they are of high quality; when they are predicted, they show no practical improvement. Because manually checked features are not available in the NER and CF datasets, we experiment only with predicted morphological features on these tasks and find that they do not yield any practical improvement in performance.
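The abstract does not specify how the POS tags and universal features are fed into the models. One common way to add such input, shown here as a minimal sketch and not as the authors' method, is to encode the tag and features of each token as vectors and concatenate them with the token's word representation. The tag and feature inventories, the function names, and the two-dimensional word vector below are all hypothetical and chosen only for illustration.

```python
from typing import List

# Hypothetical, heavily truncated inventories (assumption for illustration);
# a real setup would use the complete tag and feature sets of the treebank.
UPOS_TAGS = ["NOUN", "VERB", "ADJ", "PROPN"]
UFEATS = ["Case=Nom", "Case=Gen", "Number=Sing", "Number=Plur"]


def one_hot(value: str, inventory: List[str]) -> List[float]:
    """Return a one-hot vector; all zeros if the value is not in the inventory."""
    return [1.0 if value == item else 0.0 for item in inventory]


def multi_hot(feats: str, inventory: List[str]) -> List[float]:
    """Encode a feature string such as 'Case=Nom|Number=Sing' as a multi-hot vector.

    An empty string or '_' is treated as "no features".
    """
    active = set(feats.split("|")) if feats and feats != "_" else set()
    return [1.0 if item in active else 0.0 for item in inventory]


def augment(word_vec: List[float], upos: str, feats: str) -> List[float]:
    """Concatenate a word vector with its POS and universal-feature encodings."""
    return word_vec + one_hot(upos, UPOS_TAGS) + multi_hot(feats, UFEATS)


# Example: a 2-dimensional stand-in for a word embedding, tagged as a
# genitive plural noun.
token_vec = augment([0.2, -0.1], "NOUN", "Case=Gen|Number=Plur")
```

In this sketch the augmented vector would replace the plain word vector as the input to the downstream LSTM or classifier layer. Trainable embeddings for the tags and features could be used in place of the fixed one-hot and multi-hot encodings.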

research
07/17/2018

Improving Named Entity Recognition by Jointly Learning to Disambiguate Morphological Tags

Previous studies have shown that linguistic features of a word such as p...
research
06/02/2020

Exploring Cross-sentence Contexts for Named Entity Recognition with BERT

Named entity recognition (NER) is frequently addressed as a sequence cla...
research
10/09/2018

Learning Noun Cases Using Sequential Neural Networks

Morphological declension, which aims to inflect nouns to indicate number...
research
04/25/2020

Hierarchical Multi Task Learning with Subword Contextual Embeddings for Languages with Rich Morphology

Morphological information is important for many sequence labeling tasks ...
research
03/16/2022

KinyaBERT: a Morphology-aware Kinyarwanda Language Model

Pre-trained language models such as BERT have been successful at tacklin...
research
02/01/2023

On the Role of Morphological Information for Contextual Lemmatization

Lemmatization is a Natural Language Processing (NLP) task which consists...
research
11/09/2021

Tackling Morphological Analogies Using Deep Learning – Extended Version

Analogical proportions are statements of the form "A is to B as C is to ...
