Cross-Lingual Knowledge Distillation for Answer Sentence Selection in Low-Resource Languages

05/25/2023
by Shivanshu Gupta, et al.

While impressive performance has been achieved on the task of Answer Sentence Selection (AS2) for English, the same does not hold for languages that lack large labeled datasets. In this work, we propose Cross-Lingual Knowledge Distillation (CLKD) from a strong English AS2 teacher as a method to train AS2 models for low-resource languages without the need for labeled data in the target language. To evaluate our method, we introduce 1) Xtr-WikiQA, a translation-based WikiQA dataset for 9 additional languages, and 2) TyDi-AS2, a multilingual AS2 dataset with over 70K questions spanning 8 typologically diverse languages. We conduct extensive experiments on Xtr-WikiQA and TyDi-AS2 with multiple teachers, diverse monolingual and multilingual pretrained language models (PLMs) as students, and both monolingual and multilingual training. The results demonstrate that CLKD either outperforms or rivals supervised fine-tuning with the same amount of labeled data, as well as a pipeline that combines machine translation with the teacher model. Our method can potentially enable stronger AS2 models for low-resource languages, while TyDi-AS2 can serve as the largest multilingual AS2 dataset for further studies in the research community.
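To make the teacher-student setup concrete, below is a minimal sketch of one cross-lingual distillation step. It assumes an English cross-encoder teacher that scores (question, candidate sentence) pairs and a multilingual student trained on parallel target-language pairs using the teacher's soft labels; the model names, the KL-divergence loss, and the data pipeline are illustrative assumptions, not the paper's exact configuration.

```python
# Hedged sketch of cross-lingual knowledge distillation (CLKD) for AS2.
# Placeholder checkpoints: a real setup would use a teacher fine-tuned on
# English AS2 data and a multilingual student PLM.
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModelForSequenceClassification

teacher_name = "bert-base-uncased"              # placeholder English AS2 teacher
student_name = "bert-base-multilingual-cased"   # placeholder multilingual student

teacher_tok = AutoTokenizer.from_pretrained(teacher_name)
student_tok = AutoTokenizer.from_pretrained(student_name)
teacher = AutoModelForSequenceClassification.from_pretrained(teacher_name, num_labels=2).eval()
student = AutoModelForSequenceClassification.from_pretrained(student_name, num_labels=2)

optimizer = torch.optim.AdamW(student.parameters(), lr=2e-5)

def distill_step(en_question, en_sentence, tgt_question, tgt_sentence):
    """One CLKD step: the teacher scores the English pair, and the student
    is trained on the parallel target-language pair against the teacher's
    soft labels, so no human-labeled target-language data is required."""
    with torch.no_grad():
        t_inputs = teacher_tok(en_question, en_sentence, return_tensors="pt",
                               truncation=True, padding=True)
        soft_labels = F.softmax(teacher(**t_inputs).logits, dim=-1)

    s_inputs = student_tok(tgt_question, tgt_sentence, return_tensors="pt",
                           truncation=True, padding=True)
    s_logits = student(**s_inputs).logits
    loss = F.kl_div(F.log_softmax(s_logits, dim=-1), soft_labels,
                    reduction="batchmean")

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Example usage with an English pair and its (hypothetical) Spanish translation.
loss = distill_step(
    "Who wrote Hamlet?",
    "Hamlet is a tragedy written by William Shakespeare.",
    "¿Quién escribió Hamlet?",
    "Hamlet es una tragedia escrita por William Shakespeare.",
)
print(f"distillation loss: {loss:.4f}")
```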


Related research

10/13/2022
You Can Have Your Data and Balance It Too: Towards Balanced and Efficient Multilingual Models
Multilingual models have been widely used for cross-lingual transfer to ...

09/15/2023
Multilingual Sentence-Level Semantic Search using Meta-Distillation Learning
Multilingual semantic search is the task of retrieving relevant contents...

05/25/2022
Bitext Mining Using Distilled Sentence Representations for Low-Resource Languages
Scaling multilingual representation learning beyond the hundred most fre...

07/16/2023
Cross-Lingual NER for Financial Transaction Data in Low-Resource Languages
We propose an efficient modeling framework for cross-lingual named entit...

04/05/2022
Towards Best Practices for Training Multilingual Dense Retrieval Models
Dense retrieval models using a transformer-based bi-encoder design have ...

04/08/2020
Structure-Level Knowledge Distillation For Multilingual Sequence Labeling
Multilingual sequence labeling is a task of predicting label sequences u...

11/02/2022
Multi-level Distillation of Semantic Knowledge for Pre-training Multilingual Language Model
Pre-trained multilingual language models play an important role in cross...
