El Volumen Louder Por Favor: Code-switching in Task-oriented Semantic Parsing

01/26/2021
by Arash Einolghozati, et al.

Being able to parse code-switched (CS) utterances, such as Spanish+English or Hindi+English, is essential to democratizing task-oriented semantic parsing systems for certain locales. In this work, we focus on Spanglish (Spanish+English) and release CSTOP, a dataset of 5,800 CS utterances paired with their semantic parses. We examine the CS generalizability of various cross-lingual (XL) models and demonstrate the advantage of pre-trained XL language models when data for only one language is available. We therefore focus on improving the pre-trained models for the case where only an English corpus, together with zero or a few CS training instances, is available. We propose two data augmentation methods, one for the zero-shot and one for the few-shot setting: fine-tuning with translate-and-align, and augmenting with a generation model followed by match-and-filter. Combining the few-shot setting with these improvements closes two thirds of the initial 30-point accuracy gap between the zero-shot and full-data settings.
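As a rough illustration of the translate-and-align idea, the sketch below translates an English utterance token by token while carrying slot annotations across the alignment. This is not the authors' implementation: the toy lexicon stands in for a real MT system with word alignments, and the slot-label format is a hypothetical BIO-style scheme chosen for the example.

```python
# Toy sketch of translate-and-align data augmentation for code-switched
# semantic parsing. Words missing from the lexicon stay in English,
# which produces code-switched (Spanglish) output by construction.

# Hypothetical toy lexicon: English -> Spanish.
EN_ES = {"play": "pon", "music": "música", "louder": "más fuerte",
         "volume": "volumen", "the": "el"}

def translate_and_align(tokens, slots):
    """tokens: list of English words; slots: parallel list of slot labels.
    Returns code-switched tokens with slot labels kept aligned."""
    out_tokens, out_slots = [], []
    for tok, slot in zip(tokens, slots):
        tgt = EN_ES.get(tok.lower(), tok)  # fall back to English
        for piece in tgt.split():          # 1-to-many alignment: copy label
            out_tokens.append(piece)
            out_slots.append(slot)
    return out_tokens, out_slots

tokens = ["turn", "the", "volume", "louder", "please"]
slots  = ["O", "O", "B:COMPONENT", "B:DIRECTION", "O"]
cs_tokens, cs_slots = translate_and_align(tokens, slots)
print(cs_tokens)  # ['turn', 'el', 'volumen', 'más', 'fuerte', 'please']
```

A real pipeline would instead obtain alignments from the MT system itself and then fine-tune the parser on the resulting synthetic CS utterances.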


Related research

- Translate & Fill: Improving Zero-Shot Multilingual Semantic Parsing with Synthetic Data (09/09/2021)
- ZEROTOP: Zero-Shot Task-Oriented Semantic Parsing using Large Language Models (12/21/2022)
- Cross-TOP: Zero-Shot Cross-Schema Task-Oriented Parsing (06/10/2022)
- CST5: Data Augmentation for Code-Switched Semantic Parsing (11/14/2022)
- EntityCS: Improving Zero-Shot Cross-lingual Transfer with Entity-Centric Code Switching (10/22/2022)
- Towards Zero-Shot Functional Compositionality of Language Models (03/06/2023)
- Neuro-symbolic Zero-Shot Code Cloning with Cross-Language Intermediate Representation (04/26/2023)
