Evaluating Byte and Wordpiece Level Models for Massively Multilingual Semantic Parsing

12/14/2022
by Massimo Nicosia, et al.

Token-free approaches have been successfully applied to a series of word- and span-level tasks. In this work, we compare a byte-level (ByT5) and a wordpiece-based (mT5) sequence-to-sequence model on the 51 languages of the MASSIVE multilingual semantic parsing dataset. We examine three experimental settings: (i) zero-shot, (ii) full gold data, and (iii) zero-shot with synthetic data. By leveraging a state-of-the-art label projection method for machine-translated examples, we reduce the gap in exact match accuracy to only 5 points relative to a model trained on gold data from all languages. We additionally provide insights into the cross-lingual transfer of ByT5 and show how the model compares with mT5 across all parameter sizes.
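To make two terms from the abstract concrete, the sketch below shows what a byte-level view of an input looks like and how an exact match score can be computed. It is a minimal illustration, not the authors' evaluation code: the function names, the example parse strings, and the choice to trim surrounding whitespace before comparing are all assumptions made here.

```python
def to_utf8_bytes(text: str) -> list[int]:
    """Return the raw UTF-8 byte values of `text` (each in 0..255).

    A byte-level model reads a sequence like this instead of wordpieces.
    Any id offset or special tokens a real tokenizer adds are not modeled here.
    """
    return list(text.encode("utf-8"))


def exact_match_accuracy(predictions: list[str], references: list[str]) -> float:
    """Fraction of predictions identical to their reference.

    Assumption: surrounding whitespace is trimmed before comparison.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return hits / len(references)


if __name__ == "__main__":
    # Non-ASCII characters expand to several bytes, so byte sequences
    # are longer than character sequences for many languages.
    word = "café"
    print(len(word), len(to_utf8_bytes(word)))

    # Hypothetical parse strings, for illustration only.
    preds = ["[alarm_set time=nine am]", "[play_music artist=queen]"]
    golds = ["[alarm_set time=nine am]", "[play_music song=queen]"]
    print(exact_match_accuracy(preds, golds))
```

Exact match is a strict metric: a parse with the right intent but one wrong slot label counts as a miss, which is why a gap of a few points between settings can be meaningful.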

research
09/09/2021

Translate Fill: Improving Zero-Shot Multilingual Semantic Parsing with Synthetic Data

While multilingual pretrained language models (LMs) fine-tuned on a sing...
research
06/07/2021

Multilingual Neural Semantic Parsing for Low-Resourced Languages

Multilingual semantic parsing is a cost-effective method that allows a s...
research
04/15/2021

Zero-Shot Cross-lingual Semantic Parsing

Recent work in crosslingual semantic parsing has successfully applied ma...
research
10/23/2020

Unsupervised Cross-lingual Adaptation for Sequence Tagging and Beyond

Cross-lingual adaptation with multilingual pre-trained language models (...
research
03/16/2022

Zero-Shot Dependency Parsing with Worst-Case Aware Automated Curriculum Learning

Large multilingual pretrained language models such as mBERT and XLM-RoBE...
research
08/02/2022

Multilingual Coreference Resolution in Multiparty Dialogue

Existing multiparty dialogue datasets for coreference resolution are nas...
research
03/01/2021

On the Effectiveness of Dataset Embeddings in Mono-lingual,Multi-lingual and Zero-shot Conditions

Recent complementary strands of research have shown that leveraging info...
