Multilingual Transformer Encoders: a Word-Level Task-Agnostic Evaluation

07/19/2022
by Félix Gaschi, et al.

Some Transformer-based models can perform cross-lingual transfer learning: they can be trained on a specific task in one language and give relatively good results on the same task in another language, despite having been pre-trained on monolingual tasks only. However, there is no consensus yet on whether those Transformer-based models learn universal patterns across languages. We propose a word-level, task-agnostic method to evaluate the alignment of contextualized representations built by such models. We show that our method provides more accurate translated word pairs than previous methods for evaluating word-level alignment. Our results also show that some inner layers of multilingual Transformer-based models outperform other explicitly aligned representations, and even more so according to a stricter definition of multilingual alignment.
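
To make the idea of word-level alignment concrete, here is a minimal sketch of one common way to score it: nearest-neighbor retrieval over translated word pairs. This is not the method proposed in the paper, whose details are not given in the abstract. It assumes that row i of `src_vecs` and row i of `tgt_vecs` hold the representations of a word and its translation, and it uses cosine similarity; the function names and the toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def retrieval_accuracy(src_vecs, tgt_vecs):
    """Fraction of source words whose most similar target vector is their own translation.

    Assumption: src_vecs[i] and tgt_vecs[i] represent a translated word pair.
    """
    if len(src_vecs) != len(tgt_vecs):
        raise ValueError("source and target lists must have the same length")
    if not src_vecs:
        return 0.0
    hits = 0
    for i, src in enumerate(src_vecs):
        # Index of the target vector closest to this source vector.
        best = max(range(len(tgt_vecs)), key=lambda j: cosine(src, tgt_vecs[j]))
        if best == i:
            hits += 1
    return hits / len(src_vecs)


# Toy, made-up vectors: the first two pairs are well aligned, the third is not.
toy_src = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.5]]
toy_tgt = [[0.9, 0.1], [0.1, 0.9], [1.0, -1.0]]
toy_score = retrieval_accuracy(toy_src, toy_tgt)
```

Under this kind of score, a higher value means that translated words sit closer to each other than to other words in the shared space. With contextualized models, the vectors would typically be taken from a chosen layer for each word in context, which is how one could compare layers against each other.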


