Generating Synthetic Speech from SpokenVocab for Speech Translation

10/15/2022
by   Jinming Zhao, et al.

Training end-to-end speech translation (ST) systems requires large-scale data, which is unavailable for most language pairs and domains. One practical solution to this data scarcity is to convert machine translation (MT) data to ST data via text-to-speech (TTS) systems. Yet using TTS systems can be tedious and slow, as the conversion needs to be done for each MT dataset. In this work, we propose a simple, scalable and effective data augmentation technique, SpokenVocab, to convert MT data to ST data on-the-fly. The idea is to retrieve and stitch audio snippets from a SpokenVocab bank according to the words in an MT sequence. Our experiments on multiple language pairs from MuST-C show that this method outperforms strong baselines by an average of 1.83 BLEU points, and that it performs as well as TTS-generated speech. We also showcase how SpokenVocab can be applied to code-switching ST, for which TTS systems often do not exist. Our code is available at https://github.com/mingzi151/SpokenVocab
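
The retrieve-and-stitch idea can be illustrated with a minimal sketch. This is not the authors' implementation (see their repository for that); it only assumes that a bank maps each word to one or more pre-generated audio snippets, and that a sentence is rendered by concatenating one randomly chosen snippet per word. The names `stitch_utterance`, `toy_bank`, and the `<unk>` fallback for out-of-bank words are hypothetical, and short lists of floats stand in for real waveforms.

```python
import random
from typing import Dict, List

# Hypothetical bank type: each word maps to one or more pre-generated
# snippets (lists of float samples standing in for real audio).
SpokenVocabBank = Dict[str, List[List[float]]]


def stitch_utterance(
    sentence: str,
    bank: SpokenVocabBank,
    rng: random.Random,
    unk_token: str = "<unk>",
) -> List[float]:
    """Concatenate one snippet per word to form a synthetic waveform."""
    waveform: List[float] = []
    for word in sentence.lower().split():
        # Assumption: words missing from the bank fall back to a
        # placeholder snippet stored under `unk_token`.
        candidates = bank.get(word) or bank[unk_token]
        # Pick one of the available snippets for this word at random.
        waveform.extend(rng.choice(candidates))
    return waveform


# Toy bank with made-up sample values.
toy_bank: SpokenVocabBank = {
    "hello": [[0.1, 0.2], [0.1, 0.3, 0.2]],
    "world": [[0.5, 0.4, 0.3]],
    "<unk>": [[0.0]],
}

# "brave" is not in the bank, so it is rendered with the <unk> snippet.
audio = stitch_utterance("Hello brave world", toy_bank, random.Random(0))
```

Because stitching is only a lookup and a concatenation, it can run inside a data loader for each MT sentence at training time, which is what makes on-the-fly conversion plausible without running a TTS system per dataset.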


research
05/19/2023

DUB: Discrete Unit Back-translation for Speech Translation

How can speech-to-text translation (ST) perform as well as machine trans...
research
03/22/2023

Selective Data Augmentation for Robust Speech Translation

Speech translation (ST) systems translate speech in one language to text...
research
03/16/2022

Sample, Translate, Recombine: Leveraging Audio Alignments for Data Augmentation in End-to-end Speech Translation

End-to-end speech translation relies on data that pair source-language s...
research
05/08/2023

Target-Side Augmentation for Document-Level Machine Translation

Document-level machine translation faces the challenge of data sparsity ...
research
03/19/2019

compare-mt: A Tool for Holistic Comparison of Language Generation Systems

In this paper, we describe compare-mt, a tool for holistic analysis and ...
research
05/07/2021

Learning Shared Semantic Space for Speech-to-Text Translation

Having numerous potential applications and great impact, end-to-end spee...
research
12/19/2022

WACO: Word-Aligned Contrastive Learning for Speech Translation

End-to-end Speech Translation (E2E ST) aims to translate source speech i...
