English Intermediate-Task Training Improves Zero-Shot Cross-Lingual Transfer Too

05/26/2020
by Jason Phang, et al.

Intermediate-task training has been shown to substantially improve pretrained model performance on many language understanding tasks, at least in monolingual English settings. Here, we investigate whether English intermediate-task training is still helpful on non-English target tasks in a zero-shot cross-lingual setting. Using a set of 7 intermediate language understanding tasks, we evaluate intermediate-task transfer in a zero-shot cross-lingual setting on 9 target tasks from the XTREME benchmark. Intermediate-task training yields large improvements on the BUCC and Tatoeba tasks that use model representations directly without training, and moderate improvements on question-answering target tasks. Using SQuAD for intermediate training achieves the best results across target tasks, with an average improvement of 8.4 points on development sets. Selecting the best intermediate task model for each target task, we obtain a 6.1 point improvement over XLM-R Large on the XTREME benchmark, setting a new state of the art. Finally, we show that neither multi-task intermediate-task training nor continuing multilingual MLM during intermediate-task training offers significant improvements.
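The recipe summarized above has two phases: fine-tune a multilingual encoder on an English intermediate task, then fine-tune that checkpoint on the English training data of a target task and evaluate it zero-shot in other languages. The sketch below is a minimal illustration of that pipeline, not the authors' released code. It assumes the Hugging Face `transformers`/`datasets` APIs, uses BoolQ as a stand-in English intermediate task (the paper's best intermediate task, SQuAD, needs lengthier QA preprocessing), and uses XNLI as the zero-shot target; the hyperparameters and choice of languages are illustrative assumptions only.

```python
import numpy as np
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

BASE = "xlm-roberta-large"            # multilingual encoder used in the paper
tokenizer = AutoTokenizer.from_pretrained(BASE)

def pair_encoder(a_key, b_key):
    """Build a tokenization function for a sentence-pair task."""
    def encode(batch):
        return tokenizer(batch[a_key], batch[b_key],
                         truncation=True, max_length=256)
    return encode

def accuracy(eval_pred):
    logits, labels = eval_pred
    return {"accuracy": float((np.argmax(logits, -1) == labels).mean())}

# ---- Phase 1: English intermediate-task training (BoolQ as a stand-in) ----
boolq = load_dataset("super_glue", "boolq").map(
    pair_encoder("question", "passage"), batched=True)
inter_model = AutoModelForSequenceClassification.from_pretrained(BASE, num_labels=2)
Trainer(model=inter_model,
        args=TrainingArguments("phase1_intermediate", num_train_epochs=1,
                               per_device_train_batch_size=16),
        train_dataset=boolq["train"],
        tokenizer=tokenizer).train()
inter_model.save_pretrained("phase1_intermediate")

# ---- Phase 2: English target-task training, then zero-shot evaluation ----
# Reload the intermediate checkpoint with a fresh 3-way NLI head; only the
# encoder body carries over, mirroring the intermediate-task transfer recipe.
target_model = AutoModelForSequenceClassification.from_pretrained(
    "phase1_intermediate", num_labels=3, ignore_mismatched_sizes=True)

encode_nli = pair_encoder("premise", "hypothesis")
mnli = load_dataset("glue", "mnli").map(encode_nli, batched=True)
trainer = Trainer(model=target_model,
                  args=TrainingArguments("phase2_target", num_train_epochs=1,
                                         per_device_train_batch_size=16),
                  train_dataset=mnli["train"],
                  tokenizer=tokenizer,
                  compute_metrics=accuracy)
trainer.train()

# The model never sees non-English training data; evaluate it directly on the
# Swahili and Urdu test splits of XNLI (zero-shot cross-lingual transfer).
for lang in ["sw", "ur"]:
    test = load_dataset("xnli", lang, split="test").map(encode_nli, batched=True)
    print(lang, trainer.evaluate(test))
```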

