Translate & Fill: Improving Zero-Shot Multilingual Semantic Parsing with Synthetic Data

09/09/2021
by Massimo Nicosia, et al.

While multilingual pretrained language models (LMs) fine-tuned on a single language have shown substantial cross-lingual task transfer capabilities, their performance on semantic parsing tasks still lags well behind what is achievable when target-language supervision is available. In this paper, we propose a novel Translate-and-Fill (TaF) method to produce silver training data for a multilingual semantic parser. This method simplifies the popular Translate-Align-Project (TAP) pipeline and consists of a sequence-to-sequence filler model that constructs a full parse conditioned on an utterance and a view of the same parse. Our filler is trained on English data only but can accurately complete instances in other languages (i.e., translations of the English training utterances) in a zero-shot fashion. Experimental results on three multilingual semantic parsing datasets show that data augmentation with TaF reaches accuracies competitive with similar systems that rely on traditional alignment techniques.
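
The filler's input and output can be illustrated with a minimal sketch. The abstract does not say what the "view" of the parse contains, so this example assumes it keeps only the intent and slot labels and drops the utterance tokens. The bracketed parse format, the `<sep>` separator, and the function names `parse_view`, `filler_example`, and `filler_input_for_translation` are all hypothetical, and the filler model itself is not shown.

```python
from typing import Tuple

SEP = "<sep>"  # hypothetical separator between utterance and parse view


def parse_view(parse: str) -> str:
    """Assumed 'view': keep opening labels and closing brackets, drop utterance tokens."""
    tokens = parse.split()
    kept = [t for t in tokens if t.startswith("[") or t == "]"]
    return " ".join(kept)


def filler_example(utterance: str, parse: str) -> Tuple[str, str]:
    """Build one English training pair: (utterance + view) -> full parse."""
    source = f"{utterance} {SEP} {parse_view(parse)}"
    return source, parse


def filler_input_for_translation(translated_utterance: str, english_parse: str) -> str:
    """Pair a translated utterance with the view of the original English parse.

    A filler trained on English pairs would be asked to complete this input
    into a full parse in the target language (zero-shot).
    """
    return f"{translated_utterance} {SEP} {parse_view(english_parse)}"


english_parse = "[IN:CREATE_ALARM set an alarm [SL:DATE_TIME for 7 am ] ]"
source, target = filler_example("set an alarm for 7 am", english_parse)
spanish_input = filler_input_for_translation("pon una alarma para las 7 am", english_parse)
```

Under these assumptions, no word alignment step is needed: the translated utterance supplies the tokens, the view supplies the structure, and the filler is left to decide which tokens belong in which slot.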


