Exploring Phoneme-Level Speech Representations for End-to-End Speech Translation

06/04/2019
by   Elizabeth Salesky, et al.

Previous work on end-to-end translation from speech has primarily used frame-level features as speech representations, which creates longer, sparser sequences than text. We show that a naive method to create compressed phoneme-like speech representations is far more effective and efficient for translation than traditional frame-level speech features. Specifically, we generate phoneme labels for speech frames and average consecutive frames with the same label to create shorter, higher-level source sequences for translation. We see improvements of up to 5 BLEU on both our high- and low-resource language pairs, with a reduction in training time of 60%. Our improvements hold across multiple data sizes and two language pairs.
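The compression step described in the abstract (averaging consecutive frames that share a phoneme label) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `average_by_phoneme`, the NumPy array layout (one row per frame), and the toy labels are assumptions made for illustration, and the sketch assumes that per-frame phoneme labels are already available from some alignment or classification step.

```python
import numpy as np


def average_by_phoneme(features, labels):
    """Average each run of consecutive frames that share the same label.

    features: array of shape (num_frames, feature_dim), one row per frame.
    labels:   sequence of length num_frames with one phoneme label per frame.
    Returns (compressed_features, run_labels), one entry per run.
    Hypothetical helper for illustration only.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    if features.shape[0] != labels.shape[0]:
        raise ValueError("features and labels must have the same number of frames")
    if labels.size == 0:
        return features[:0], labels

    # Index of the first frame of every run of identical labels.
    starts = np.flatnonzero(np.r_[True, labels[1:] != labels[:-1]])
    # Exclusive end index of every run.
    ends = np.r_[starts[1:], labels.size]

    # Sum the frames inside each run, then divide by the run length.
    sums = np.add.reduceat(features, starts, axis=0)
    lengths = (ends - starts)[:, None]
    return sums / lengths, labels[starts]


# Toy example: 5 frames with 2-dimensional features and made-up labels.
toy_features = [[1, 1], [3, 3], [5, 5], [7, 7], [9, 9]]
toy_labels = ["a", "a", "b", "a", "a"]
compressed, run_labels = average_by_phoneme(toy_features, toy_labels)
```

Only adjacent frames are merged, so a label that recurs later in the utterance starts a new segment rather than being pooled with its earlier occurrence. The output sequence has one vector per run, which is what makes the source sequence shorter than the frame-level input.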

research
02/12/2018

End-to-End Automatic Speech Translation of Audiobooks

We investigate end-to-end speech-to-text translation on a corpus of audi...
research
03/24/2022

Leveraging unsupervised and weakly-supervised data to improve direct speech-to-speech translation

End-to-end speech-to-speech translation (S2ST) without relying on interm...
research
06/22/2020

Self-Supervised Representations Improve End-to-End Speech Translation

End-to-end speech-to-text translation can provide a simpler and smaller ...
research
05/27/2020

Phone Features Improve Speech Translation

End-to-end models for speech translation (ST) more tightly couple speech...
research
11/20/2019

On Using SpecAugment for End-to-End Speech Translation

This work investigates a simple data augmentation technique, SpecAugment...
research
09/09/2021

Speechformer: Reducing Information Loss in Direct Speech Translation

Transformer-based models have gained increasing popularity achieving sta...
research
12/19/2022

SegAugment: Maximizing the Utility of Speech Translation Data with Segmentation-based Augmentations

Data scarcity is one of the main issues with the end-to-end approach for...