Selective Data Augmentation for Robust Speech Translation

03/22/2023
by   Rajul Acharya, et al.
0

Speech translation (ST) systems translate speech in one language to text in another language. End-to-end ST systems (e2e-ST) have gained popularity over cascade systems because of their enhanced performance due to reduced latency and computational cost. Though resource intensive, e2e-ST systems have the inherent ability to retain para and non-linguistic characteristics of the speech unlike cascade systems. In this paper, we propose to use an e2e architecture for English-Hindi (en-hi) ST. We use two imperfect machine translation (MT) services to translate Libri-trans en text into hi text. While each service gives MT data individually to generate parallel ST data, we propose a data augmentation strategy of noisy MT data to aid robust ST. The main contribution of this paper is the proposal of a data augmentation strategy. We show that this results in better ST (BLEU score) compared to brute force augmentation of MT data. We observed an absolute improvement of 1.59 BLEU score with our approach.

READ FULL TEXT
research
09/14/2019

Leveraging Out-of-Task Data for End-to-End Automatic Speech Translation

For automatic speech translation (AST), end-to-end approaches are outper...
research
09/14/2019

Harnessing Indirect Training Data for End-to-End Automatic Speech Translation: Tricks of the Trade

For automatic speech translation (AST), end-to-end approaches are outper...
research
10/15/2022

Generating Synthetic Speech from SpokenVocab for Speech Translation

Training end-to-end speech translation (ST) systems requires sufficientl...
research
03/16/2022

Sample, Translate, Recombine: Leveraging Audio Alignments for Data Augmentation in End-to-end Speech Translation

End-to-end speech translation relies on data that pair source-language s...
research
03/31/2021

Few-shot learning through contextual data augmentation

Machine translation (MT) models used in industries with constantly chang...
research
05/09/2023

E2TIMT: Efficient and Effective Modal Adapter for Text Image Machine Translation

Text image machine translation (TIMT) aims to translate texts embedded i...
research
05/18/2022

PreQuEL: Quality Estimation of Machine Translation Outputs in Advance

We present the task of PreQuEL, Pre-(Quality-Estimation) Learning. A Pre...

Please sign up or login with your details

Forgot password? Click here to reset