ZmBART: An Unsupervised Cross-lingual Transfer Framework for Language Generation

06/03/2021
by   Kaushal Kumar Maurya, et al.

Despite recent advances in NLP research, cross-lingual transfer for natural language generation (NLG) remains relatively understudied. In this work, we transfer supervision from a high-resource language (HRL) to multiple low-resource languages (LRLs) for NLG. We consider four NLG tasks (text summarization, question generation, news headline generation, and distractor generation) and three syntactically diverse languages: English, Hindi, and Japanese. We propose an unsupervised cross-lingual language generation framework (called ZmBART) that does not use any parallel or pseudo-parallel/back-translated data. In this framework, we further pre-train the mBART sequence-to-sequence denoising auto-encoder on an auxiliary task using monolingual data from the three languages. The auxiliary objective is close to the target tasks, which enriches mBART's multilingual latent representation and provides a good initialization for the target tasks. The model is then fine-tuned with task-specific supervised English data and evaluated directly on the low-resource languages in a zero-shot setting. To overcome catastrophic forgetting and spurious correlation issues, we apply model-component freezing and data augmentation, respectively. This simple modeling approach gave us promising results. We also experimented with few-shot training (with 1,000 supervised data points), which boosted model performance further. We performed several ablations and cross-lingual transferability analyses to demonstrate the robustness of ZmBART.
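The core recipe (fine-tuning mBART on English task data with selected components frozen, then decoding directly in an LRL with no LRL supervision) can be illustrated with a minimal sketch. The Hugging Face checkpoint name, the particular components frozen, the hyper-parameters, and the toy question-generation example below are illustrative assumptions, not the paper's exact configuration, and the auxiliary pre-training stage is omitted.

```python
# Minimal sketch of a ZmBART-style transfer recipe: English-only fine-tuning
# of mBART followed by zero-shot decoding in Hindi. All specifics here
# (checkpoint, frozen modules, learning rate, toy data) are assumptions.
import torch
from transformers import MBartForConditionalGeneration, MBartTokenizer

model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-cc25")
tokenizer = MBartTokenizer.from_pretrained("facebook/mbart-large-cc25")

# Freeze the shared embeddings and the encoder to limit catastrophic
# forgetting of the multilingual representations during English fine-tuning.
for p in model.model.shared.parameters():
    p.requires_grad = False
for p in model.model.encoder.parameters():
    p.requires_grad = False

optimizer = torch.optim.AdamW(
    [p for p in model.parameters() if p.requires_grad], lr=3e-5
)

# Task-specific supervised fine-tuning on English only
# (toy example: question generation from a passage).
tokenizer.src_lang, tokenizer.tgt_lang = "en_XX", "en_XX"
src = "The Taj Mahal is located in Agra."
tgt = "Where is the Taj Mahal located?"
batch = tokenizer(src, text_target=tgt, return_tensors="pt")

model.train()
loss = model(**batch).loss
loss.backward()
optimizer.step()
optimizer.zero_grad()

# Zero-shot evaluation: feed Hindi input and force Hindi decoding,
# with no Hindi task supervision at any point.
model.eval()
tokenizer.src_lang = "hi_IN"
hindi_passage = "ताजमहल आगरा में स्थित है।"
inputs = tokenizer(hindi_passage, return_tensors="pt")
generated = model.generate(
    **inputs,
    decoder_start_token_id=tokenizer.lang_code_to_id["hi_IN"],
    max_length=32,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```

In this sketch the same English fine-tuned checkpoint is reused unchanged for each low-resource language; the few-shot variant described in the abstract would simply continue fine-tuning on roughly 1,000 task-specific LRL examples before evaluation.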


