Improved Cross-Lingual Transfer Learning For Automatic Speech Translation

06/01/2023
by Sameer Khurana, et al.

Research in multilingual speech-to-text translation is topical. Having a single model that supports multiple translation tasks is desirable. The goal of this work is to improve cross-lingual transfer learning in multilingual speech-to-text translation via semantic knowledge distillation. We show that by initializing the encoder of the encoder-decoder sequence-to-sequence translation model with SAMU-XLS-R, a multilingual speech transformer encoder trained using multi-modal (speech-text) semantic knowledge distillation, we achieve significantly better cross-lingual task knowledge transfer than with the baseline XLS-R, a multilingual speech transformer encoder trained via self-supervised learning. We demonstrate the effectiveness of our approach on two popular datasets, CoVoST-2 and Europarl. On the 21 translation tasks of the CoVoST-2 benchmark, we achieve an average improvement of 12.8 BLEU points over the baselines. In the zero-shot translation scenario, we achieve average gains of 18.8 and 11.9 BLEU points on unseen medium- and low-resource languages, respectively. We make similar observations on the Europarl speech translation benchmark.
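The recipe described in the abstract has two stages: first train a speech encoder so that its pooled output matches multilingual text sentence embeddings (semantic knowledge distillation), then use that encoder to initialize an encoder-decoder speech-to-text translation model and fine-tune. The sketch below is a minimal PyTorch illustration of this idea under toy assumptions; the class names SpeechEncoder and SpeechTranslator, the mean-pooling step, the cosine distillation loss, and all dimensions are illustrative stand-ins, not the paper's actual implementation or the real SAMU-XLS-R / XLS-R code.

```python
# Minimal sketch of (1) distilling semantics from frozen text sentence
# embeddings into a speech encoder, and (2) initializing a speech-to-text
# translation model with the distilled encoder. All modules are toy
# stand-ins for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SpeechEncoder(nn.Module):
    """Toy stand-in for a multilingual speech transformer encoder."""

    def __init__(self, feat_dim=80, d_model=256, n_layers=4):
        super().__init__()
        self.proj = nn.Linear(feat_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, feats):                    # feats: (B, T, feat_dim)
        return self.encoder(self.proj(feats))    # (B, T, d_model)


def distillation_loss(speech_states, text_embedding):
    """Cosine-distance loss pulling the pooled speech embedding toward a
    frozen multilingual text sentence embedding (semantic distillation)."""
    pooled = speech_states.mean(dim=1)           # (B, d_model)
    return (1.0 - F.cosine_similarity(pooled, text_embedding, dim=-1)).mean()


class SpeechTranslator(nn.Module):
    """Encoder-decoder sequence-to-sequence speech-to-text translation model."""

    def __init__(self, encoder, vocab_size=1000, d_model=256):
        super().__init__()
        self.encoder = encoder
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerDecoderLayer(d_model, nhead=4, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=4)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, feats, tgt_tokens):
        memory = self.encoder(feats)             # encode speech
        tgt = self.embed(tgt_tokens)             # embed target tokens
        return self.out(self.decoder(tgt, memory))


# Stage 1 (sketch): distill semantics from text embeddings into the encoder.
encoder = SpeechEncoder()
feats = torch.randn(2, 50, 80)                   # dummy speech features
text_emb = torch.randn(2, 256)                   # dummy frozen sentence embeddings
loss = distillation_loss(encoder(feats), text_emb)
loss.backward()

# Stage 2 (sketch): initialize the translation encoder from the distilled
# weights, then fine-tune on speech-to-text translation data.
st_model = SpeechTranslator(SpeechEncoder())
st_model.encoder.load_state_dict(encoder.state_dict())
logits = st_model(feats, torch.randint(0, 1000, (2, 10)))
print(logits.shape)                              # torch.Size([2, 10, 1000])
```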

Related research

10/24/2020  Cross-Modal Transfer Learning for Multilingual Speech-to-Text Translation
We propose an effective approach to utilize pretrained speech and text m...

12/19/2022  Mu^2SLAM: Multitask, Multilingual Speech and Language Models
We present Mu^2SLAM, a multilingual sequence-to-sequence model pre-train...

04/18/2021  Zero-shot Cross-lingual Transfer of Neural Machine Translation with Multilingual Pretrained Encoders
Previous works mainly focus on improving cross-lingual transfer for NLU ...

05/23/2022  The Importance of Being Parameters: An Intra-Distillation Method for Serious Gains
Recent model pruning methods have demonstrated the ability to remove red...

02/10/2023  Language-Aware Multilingual Machine Translation with Self-Supervised Learning
Multilingual machine translation (MMT) benefits from cross-lingual trans...

03/09/2023  MixSpeech: Cross-Modality Self-Learning with Audio-Visual Stream Mixup for Visual Speech Translation and Recognition
Multi-media communications facilitate global interaction among people. H...

02/26/2023  Cross-lingual Knowledge Transfer via Distillation for Multilingual Information Retrieval
In this paper, we introduce the approach behind our submission for the M...