Evaluating Transformer-Based Multilingual Text Classification

04/29/2020
by Sophie Groenwold, et al.

As NLP tools become ubiquitous in today's technological landscape, they are increasingly applied to languages with a variety of typological structures. However, NLP research does not focus primarily on typological differences in its analysis of state-of-the-art language models. As a result, NLP tools perform unequally across languages with different syntactic and morphological structures. Through a detailed discussion of word order typology, morphological typology, and comparative linguistics, we identify which variables most affect language modeling efficacy; in addition, we calculate word order and morphological similarity indices to aid our empirical study. We then use this background to support our analysis of an experiment we conduct using multi-class text classification on eight languages and eight models.

