Bootstrapping Multilingual Semantic Parsers using Large Language Models

10/13/2022
by Abhijeet Awasthi, et al.

Despite the cross-lingual generalization demonstrated by pre-trained multilingual models, the translate-train paradigm of transferring English datasets across multiple languages remains the key ingredient for training task-specific multilingual models. However, for many low-resource languages, building a reliable translation service requires significant amounts of costly human-annotated translation pairs. Further, translation services for low-resource languages may remain brittle due to domain mismatch between the task-specific input text and the general-purpose text used to train the translation models. We consider the task of multilingual semantic parsing and demonstrate the effectiveness and flexibility offered by large language models (LLMs) for translating English datasets into several languages via few-shot prompting. We provide (i) extensive comparisons with prior translate-train methods across 50 languages, demonstrating that LLMs can serve as highly effective data translators, outperforming prior translation-based methods on 40 out of 50 languages; and (ii) a comprehensive study of the key design choices that enable effective data translation via prompted LLMs.
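To make the idea of translating English utterances via few-shot prompting concrete, here is a minimal sketch of how such a prompt might be assembled. It is not the authors' prompt format, which the abstract does not specify: the function name `build_translation_prompt`, the "English:" / target-language line layout, and the example utterances are all assumptions made for illustration. The sketch only builds the prompt string; sending it to an LLM is left out.

```python
from typing import List, Tuple


def build_translation_prompt(
    exemplars: List[Tuple[str, str]],
    english_query: str,
    target_language: str,
) -> str:
    """Assemble a few-shot prompt asking an LLM to translate one utterance.

    The layout is a hypothetical one chosen for illustration: an instruction
    line, then (English, translation) exemplar pairs, then the new English
    utterance with the target-language slot left open for the model.
    """
    lines = [f"Translate English to {target_language}."]
    for english, translated in exemplars:
        lines.append(f"English: {english}")
        lines.append(f"{target_language}: {translated}")
    # The final pair is left incomplete so the model continues with a translation.
    lines.append(f"English: {english_query}")
    lines.append(f"{target_language}:")
    return "\n".join(lines)


# Made-up exemplars; a real setup would draw them from a small set of
# human-translated task utterances.
exemplars = [
    ("set an alarm for 7 am", "pon una alarma para las 7 am"),
    ("what is the weather today", "qué tiempo hace hoy"),
]
prompt = build_translation_prompt(exemplars, "play some jazz music", "Spanish")
print(prompt)
```

Because the exemplars come from the task domain rather than general-purpose text, a prompt of this shape is one plausible way to reduce the domain mismatch the abstract mentions.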


