Unit-based Speech-to-Speech Translation Without Parallel Data

05/24/2023
by Anuj Diwan, et al.

We propose an unsupervised speech-to-speech translation (S2ST) system that does not rely on parallel data between the source and target languages. Our approach maps source and target language speech signals into automatically discovered discrete units and reformulates the problem as unsupervised unit-to-unit machine translation. We develop a three-step training procedure that involves (a) pre-training a unit-based encoder-decoder language model with a denoising objective, (b) training it on word-by-word translated utterance pairs created by aligning monolingual text embedding spaces, and (c) running unsupervised backtranslation, bootstrapped from the initial translation model. Our approach avoids mapping the speech signal into text and uses speech-to-unit and unit-to-speech models instead of automatic speech recognition and text-to-speech models. We evaluate our model on synthetic-speaker Europarl-ST English-German and German-English evaluation sets, finding that unit-based translation is feasible under this constrained scenario, achieving 9.29 ASR-BLEU for German to English and 8.07 for English to German.
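As a rough illustration of step (b), the sketch below builds a pseudo-parallel pair by replacing each source word with its nearest neighbour in a shared embedding space. It is a toy under stated assumptions, not the authors' implementation: the embeddings are hand-made and assumed to be already aligned, the function names (`translate_word`, `translate_word_by_word`) are hypothetical, and the later conversion of such text pairs into unit sequences is not shown.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def translate_word(word, src_emb, tgt_emb):
    """Return the target word closest to `word` in the shared space.

    Words without a source embedding are copied through unchanged.
    """
    if word not in src_emb:
        return word
    vec = src_emb[word]
    return max(tgt_emb, key=lambda t: cosine(vec, tgt_emb[t]))


def translate_word_by_word(sentence, src_emb, tgt_emb):
    """Translate a whitespace-tokenised sentence one word at a time."""
    return " ".join(translate_word(w, src_emb, tgt_emb) for w in sentence.split())


# Toy embeddings (assumption): pretend these monolingual spaces were
# already aligned, so translation pairs lie close together.
SRC_EMB = {"das": [0.9, 0.1, 0.0], "haus": [0.1, 0.9, 0.1], "ist": [0.0, 0.2, 0.9]}
TGT_EMB = {"the": [1.0, 0.0, 0.0], "house": [0.0, 1.0, 0.0], "is": [0.0, 0.0, 1.0]}

source = "das haus ist"
pseudo_pair = (source, translate_word_by_word(source, SRC_EMB, TGT_EMB))
print(pseudo_pair)
```

Word-by-word output ignores word order and context, which is consistent with the abstract's use of these pairs only as an initial training signal before backtranslation.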


