The Interpreter Understands Your Meaning: End-to-end Spoken Language Understanding Aided by Speech Translation

05/16/2023
by Mutian He, et al.

End-to-end spoken language understanding (SLU) remains elusive even with current large pretrained language models on text and speech, especially in multilingual cases. Machine translation has been established as a powerful pretraining objective on text, as it enables the model to capture high-level semantics of the input utterance and associations between different languages. Both properties are desirable for speech models, which operate on lower-level acoustic frames. Motivated particularly by the task of cross-lingual SLU, we demonstrate that speech translation (ST) is a good means of pretraining speech models for end-to-end SLU in both monolingual and cross-lingual scenarios. By introducing ST, our models outperform current baselines on monolingual and multilingual intent classification as well as spoken question answering, using the SLURP, MINDS-14, and NMSQA benchmarks. To verify the effectiveness of our methods, we also release two new benchmark datasets from both synthetic and real sources, for the tasks of abstractive summarization from speech and low-resource or zero-shot transfer from English to French. We further show the value of preserving knowledge from the pretraining task, and to that end explore Bayesian transfer learning on pretrained speech models based on continual learning regularizers.

