Are the Multilingual Models Better? Improving Czech Sentiment with Transformers

08/24/2021
by Pavel Přibáň, et al.

In this paper, we aim to improve Czech sentiment analysis with transformer-based models and their multilingual versions. More concretely, we study the task of polarity detection for the Czech language on three sentiment polarity datasets. We fine-tune and experiment with five multilingual and three monolingual models. We compare the performance of the monolingual and multilingual models, including a comparison with the older approach based on recurrent neural networks. Furthermore, we test the ability of the multilingual models to transfer knowledge from English to Czech (and vice versa) with zero-shot cross-lingual classification. Our experiments show that the large multilingual models can outperform the monolingual models. They are also able to detect polarity in another language without any training data, with performance not worse than 4.4% compared to state-of-the-art monolingual trained models. Moreover, we achieved new state-of-the-art results on all three datasets.
