DICT-MLM: Improved Multilingual Pre-Training using Bilingual Dictionaries

10/23/2020
by Aditi Chaudhary, et al.

Pre-trained multilingual language models such as mBERT have shown immense gains on several natural language processing (NLP) tasks, especially in the zero-shot cross-lingual setting. Most, if not all, of these pre-trained models rely on masked language modeling (MLM) as the key language learning objective. The principle behind these approaches is that predicting masked words from the surrounding text helps the model learn potent contextualized representations. Despite the strong representation learning capability enabled by MLM, we demonstrate an inherent limitation of MLM for multilingual representation learning. In particular, by requiring the model to predict the language-specific token, the MLM objective disincentivizes learning a language-agnostic representation, which is a key goal of multilingual pre-training. Therefore, to encourage better cross-lingual representation learning, we propose the DICT-MLM method. DICT-MLM works by incentivizing the model to predict not just the original masked word, but potentially any of its cross-lingual synonyms as well. Our empirical analysis on multiple downstream tasks spanning 30+ languages demonstrates the efficacy of the proposed approach and its ability to learn better multilingual representations.
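To make the contrast between the two objectives concrete, here is a minimal sketch in plain Python. The abstract does not state the exact loss used by DICT-MLM, so the formulation below (negative log of the total probability assigned to the original word and its dictionary translations) is an assumption chosen for illustration. The toy vocabulary, the `dictionary` mapping, and the function names `mlm_loss`, `dict_mlm_loss`, and `target_set` are all hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mlm_loss(logits, target_id):
    """Standard MLM: only the original, language-specific token is correct."""
    probs = softmax(logits)
    return -math.log(probs[target_id])

def dict_mlm_loss(logits, target_ids):
    """Assumed DICT-MLM-style loss: any token in the target set counts as correct.

    This sums probability mass over the set; the paper's actual loss may differ.
    """
    probs = softmax(logits)
    return -math.log(sum(probs[i] for i in set(target_ids)))

# Hypothetical toy vocabulary and bilingual dictionary entries.
vocab = ["dog", "perro", "chien", "cat", "house"]
dictionary = {"dog": ["perro", "chien"]}

def target_set(word):
    """Ids of the masked word plus its cross-lingual synonyms found in the vocabulary."""
    ids = [vocab.index(word)]
    ids += [vocab.index(t) for t in dictionary.get(word, []) if t in vocab]
    return ids

# Made-up model scores for one masked position whose original word is "dog".
logits = [1.0, 2.0, 0.5, -1.0, 0.0]
standard = mlm_loss(logits, vocab.index("dog"))
relaxed = dict_mlm_loss(logits, target_set("dog"))
print(f"MLM loss: {standard:.3f}, dictionary-relaxed loss: {relaxed:.3f}")
```

In this toy case the model scores "perro" highest. Standard MLM penalizes that prediction because only "dog" is accepted, while the relaxed loss also credits the probability placed on the translations, so it can never exceed the standard loss for the same scores.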

