UWSpeech: Speech to Speech Translation for Unwritten Languages

06/14/2020
by Chen Zhang, et al.

Existing speech-to-speech translation systems rely heavily on target-language text: they usually either translate source speech into target text and then synthesize target speech from that text, or translate directly into target speech while using target text for auxiliary training. However, these methods cannot be applied to unwritten target languages, which have no written text or phonemes available. In this paper, we develop a translation system for unwritten languages, named UWSpeech, which converts target unwritten speech into discrete tokens with a converter, then translates source-language speech into target discrete tokens with a translator, and finally synthesizes target speech from target discrete tokens with an inverter. We propose a method called XL-VAE, which enhances the vector quantized variational autoencoder (VQ-VAE) with cross-lingual (XL) speech recognition, to train the converter and inverter of UWSpeech jointly. Experiments on the Fisher Spanish-English conversation translation dataset show that UWSpeech outperforms direct translation and a VQ-VAE baseline by about 16 and 10 BLEU points respectively, which demonstrates the advantages and potential of UWSpeech.
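
To make the idea of "discrete tokens" concrete, the sketch below shows the generic vector-quantization step used in VQ-VAE-style models: each continuous speech frame is replaced by the index of its nearest codebook vector, and the inverse lookup maps indices back to vectors. This is only an illustration under stated assumptions, not the XL-VAE method from the paper. The function names `quantize` and `dequantize`, the toy codebook, and the frame values are all hypothetical, and the learned encoder, decoder, and cross-lingual recognition component are omitted.

```python
import numpy as np

def quantize(frames: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    """Map each frame (shape (T, D)) to the index of its nearest
    codebook vector (codebook shape (K, D)). Returns shape (T,)."""
    # Squared Euclidean distance between every frame and every code: (T, K)
    dists = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1)

def dequantize(tokens: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    """Look up the codebook vector for each discrete token."""
    return codebook[tokens]

# Toy data (hypothetical): 3 codebook entries, 4 two-dimensional frames.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]])
frames = np.array([[0.1, -0.2], [0.9, 1.2], [3.5, 0.3], [0.8, 0.7]])

tokens = quantize(frames, codebook)   # discrete token sequence
recon = dequantize(tokens, codebook)  # quantized frame vectors
print(tokens.tolist())
```

In a pipeline like the one the abstract describes, a token sequence of this kind would stand in for the missing target text: the translator would predict it from source speech, and the inverter would synthesize audio from it.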

research
10/02/2019

Speech-to-speech Translation between Untranscribed Unknown Languages

In this paper, we explore a method for training speech-to-speech transla...
research
10/31/2022

Joint Pre-Training with Speech and Bilingual Text for Direct Speech to Speech Translation

Direct speech-to-speech translation (S2ST) is an attractive research top...
research
04/10/2023

Enhancing Speech-to-Speech Translation with Multiple TTS Targets

It has been known that direct speech-to-speech translation (S2ST) models...
research
10/15/2021

Direct simultaneous speech to speech translation

We present the first direct simultaneous speech-to-speech translation (S...
research
09/21/2020

SDST: Successive Decoding for Speech-to-text Translation

End-to-end speech-to-text translation (ST), which directly translates th...
research
06/27/2023

Automatic Annotation of Direct Speech in Written French Narratives

The automatic annotation of direct speech (AADS) in written text has bee...
research
06/29/2023

Learning Multilingual Expressive Speech Representation for Prosody Prediction without Parallel Data

We propose a method for speech-to-speech emotion-preserving translation t...
