Por Qué Não Utiliser Alla Språk? Mixed Training with Gradient Optimization in Few-Shot Cross-Lingual Transfer

04/29/2022
by   Haoran Xu, et al.

The current state of the art in few-shot cross-lingual transfer learning first trains on abundant labeled data in the source language and then fine-tunes on a few examples in the target language, a step termed target-adapting. Although this approach has been shown to work on a variety of tasks, in this paper we identify its deficiencies and propose a one-step mixed training method that trains on both source and target data with stochastic gradient surgery, a novel gradient-level optimization. Unlike previous studies, which target-adapt to one language at a time, we use a single model to handle all target languages simultaneously, avoiding excessively language-specific models. Moreover, we discuss the impracticality of the large target-language development sets used for model selection in prior work. We further show that our method requires no development set for the target languages and also avoids overfitting. We conduct a large-scale experiment on 4 diverse NLP tasks across up to 48 languages. Our proposed method achieves state-of-the-art performance on all tasks and outperforms target-adapting by a large margin, especially for languages that are linguistically distant from the source language, e.g., a 7.36 F1 absolute gain on average for the NER task, up to 17.60
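The abstract does not spell out the details of the proposed stochastic gradient surgery. As a rough illustration of the general gradient-surgery idea it builds on (in the style of PCGrad: when two task gradients conflict, one is projected onto the normal plane of the other), here is a minimal NumPy sketch; the function and variable names are illustrative, not from the paper.

```python
import numpy as np

def surgery(g_a, g_b):
    """Remove from g_a the component that conflicts with g_b.

    If the two gradients have negative cosine similarity (i.e. they point
    in conflicting directions), project g_a onto the normal plane of g_b.
    Otherwise, return g_a unchanged.
    """
    dot = np.dot(g_a, g_b)
    if dot < 0:
        g_a = g_a - (dot / np.dot(g_b, g_b)) * g_b
    return g_a

# Toy example: a gradient from a source-language batch conflicts with one
# from a target-language batch (names are hypothetical).
g_src = np.array([1.0, 0.0])
g_tgt = np.array([-1.0, 1.0])

g_src_adj = surgery(g_src, g_tgt)  # conflicting component removed
update = g_src_adj + g_tgt         # a mixed-training update direction
```

After surgery, the adjusted source gradient is orthogonal to (no longer conflicts with) the target gradient, so the combined update does not undo progress on either language; gradients that already agree pass through unchanged.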
