On Learning Universal Representations Across Languages

07/31/2020
by Xiangpeng Wei, et al.

Recent studies have demonstrated the overwhelming advantage of cross-lingual pre-trained models (PTMs), such as multilingual BERT and XLM, on cross-lingual NLP tasks. However, existing approaches essentially capture only the co-occurrence among tokens through the masked language model (MLM) objective with token-level cross entropy. In this work, we extend these approaches to learn sentence-level representations and show their effectiveness on cross-lingual understanding and generation. We propose Hierarchical Contrastive Learning (HiCTL) to (1) learn universal representations for parallel sentences distributed in one or multiple languages and (2) distinguish the semantically related words from a shared cross-lingual vocabulary for each sentence. We evaluate on three benchmarks: language understanding tasks (QQP, QNLI, SST-2, MRPC, STS-B and MNLI) from the GLUE benchmark, cross-lingual natural language inference (XNLI), and machine translation. Experimental results show that HiCTL obtains an absolute gain of 1.0 point in accuracy on GLUE/XNLI and achieves substantial improvements of +1.7 to +3.6 BLEU on both high-resource and low-resource English-to-X translation tasks over strong baselines. We will release the source code as soon as possible.
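
To make the sentence-level objective concrete, the following is a minimal PyTorch sketch of a contrastive (InfoNCE-style) loss over parallel sentence pairs, in the spirit of the first HiCTL component: a sentence and its translation are pulled together while the other sentences in the batch serve as negatives. The function name, the temperature value, and the symmetric formulation are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn.functional as F


def sentence_contrastive_loss(src_repr: torch.Tensor,
                              tgt_repr: torch.Tensor,
                              temperature: float = 0.1) -> torch.Tensor:
    """src_repr, tgt_repr: [batch, dim] sentence embeddings of parallel pairs.

    Row i of src_repr and row i of tgt_repr are assumed to be translations
    of each other (a positive pair); all other rows act as in-batch negatives.
    """
    src = F.normalize(src_repr, dim=-1)
    tgt = F.normalize(tgt_repr, dim=-1)
    # Cosine similarity between every source and every target sentence.
    logits = src @ tgt.t() / temperature              # [batch, batch]
    labels = torch.arange(src.size(0), device=src.device)
    # Symmetric InfoNCE: each sentence must identify its own translation
    # among all candidates in the batch, in both directions.
    return 0.5 * (F.cross_entropy(logits, labels) +
                  F.cross_entropy(logits.t(), labels))
```

In practice the sentence embeddings could be taken from the encoder of a cross-lingual PTM (e.g., a pooled [CLS] state), and this loss would be added to the usual MLM objective; the word-level part of HiCTL, which contrasts semantically related words from the shared vocabulary, is not shown here.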


Related research

01/22/2019 · Cross-lingual Language Model Pretraining
Recent studies have demonstrated the efficiency of generative pretrainin...

04/17/2023 · VECO 2.0: Cross-lingual Language Model Pre-training with Multi-granularity Contrastive Learning
Recent studies have demonstrated the potential of cross-lingual transfer...

04/03/2020 · XGLUE: A New Benchmark Dataset for Cross-lingual Pre-training, Understanding and Generation
In this paper, we introduce XGLUE, a new benchmark dataset to train larg...

04/28/2020 · Self-Attention with Cross-Lingual Position Representation
Position encoding (PE), an essential part of self-attention networks (SA...

10/23/2020 · DICT-MLM: Improved Multilingual Pre-Training using Bilingual Dictionaries
Pre-trained multilingual language models such as mBERT have shown immens...

04/18/2022 · GL-CLeF: A Global-Local Contrastive Learning Framework for Cross-lingual Spoken Language Understanding
Due to high data demands of current methods, attention to zero-shot cros...

02/24/2023 · Cross-Lingual Transfer of Cognitive Processing Complexity
When humans read a text, their eye movements are influenced by the struc...
