DeepAI AI Chat
Log In Sign Up

English Intermediate-Task Training Improves Zero-Shot Cross-Lingual Transfer Too

by   Jason Phang, et al.
NYU college

Intermediate-task training has been shown to substantially improve pretrained model performance on many language understanding tasks, at least in monolingual English settings. Here, we investigate whether English intermediate-task training is still helpful on non-English target tasks in a zero-shot cross-lingual setting. Using a set of 7 intermediate language understanding tasks, we evaluate intermediate-task transfer in a zero-shot cross-lingual setting on 9 target tasks from the XTREME benchmark. Intermediate-task training yields large improvements on the BUCC and Tatoeba tasks that use model representations directly without training, and moderate improvements on question-answering target tasks. Using SQuAD for intermediate training achieves the best results across target tasks, with an average improvement of 8.4 points on development sets. Selecting the best intermediate task model for each target task, we obtain a 6.1 point improvement over XLM-R Large on the XTREME benchmark, setting a new state of the art. Finally, we show that neither multi-task intermediate-task training nor continuing multilingual MLM during intermediate-task training offer significant improvements.


Is Prompt-Based Finetuning Always Better than Vanilla Finetuning? Insights from Cross-Lingual Language Understanding

Multilingual pretrained language models (MPLMs) have demonstrated substa...

Is Supervised Syntactic Parsing Beneficial for Language Understanding? An Empirical Investigation

Traditional NLP has long held (supervised) syntactic parsing necessary f...

Training Dynamics for Curriculum Learning: A Study on Monolingual and Cross-lingual NLU

Curriculum Learning (CL) is a technique of training models via ranking e...

Modeling Profanity and Hate Speech in Social Media with Semantic Subspaces

Hate speech and profanity detection suffer from data sparsity, especiall...

ZeroSCROLLS: A Zero-Shot Benchmark for Long Text Understanding

We introduce ZeroSCROLLS, a zero-shot benchmark for natural language und...

FILTER: An Enhanced Fusion Method for Cross-lingual Language Understanding

Large-scale cross-lingual language models (LM), such as mBERT, Unicoder ...

A Closer Look at Few-Shot Crosslingual Transfer: Variance, Benchmarks and Baselines

We present a focused study of few-shot crosslingual transfer, a recently...