SDST: Successive Decoding for Speech-to-text Translation

09/21/2020
by Qianqian Dong, et al.

End-to-end speech-to-text translation (ST), which directly translates source language speech into target language text, has attracted intensive attention recently. However, combining speech recognition and machine translation in a single model places a heavy burden on the direct cross-modal, cross-lingual mapping. To reduce the learning difficulty, we propose SDST, an integral framework with Successive Decoding for the end-to-end speech-to-text translation task. The method is verified on two mainstream datasets. Experiments show that our proposed method improves over previous state-of-the-art methods by significant margins.
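The abstract does not spell out how the successive decoding is wired. One plausible reading is a two-pass decoder: first decode the source-language transcription from the speech encoder output, then decode the target translation while attending to both the encoder states and the recognition decoder's states. The PyTorch sketch below illustrates only that reading; the class name, dimensions, and the concatenation of encoder and recognition-decoder states are assumptions for illustration, not the authors' implementation.

# Hypothetical sketch of a successive-decoding ST model (not the authors' code).
# Assumed design: one speech encoder, then two Transformer decoders run in
# succession -- the first emits the source transcription, the second emits the
# target translation while attending to the encoder output and the first
# decoder's hidden states. Causal decoder masks are omitted for brevity.
import torch
import torch.nn as nn

class SuccessiveDecodingST(nn.Module):
    def __init__(self, d_model=256, n_heads=4, n_layers=6,
                 src_vocab=10000, tgt_vocab=10000, feat_dim=80):
        super().__init__()
        self.feat_proj = nn.Linear(feat_dim, d_model)  # acoustic features -> model dim
        enc_layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, n_layers)
        asr_layer = nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True)
        self.asr_decoder = nn.TransformerDecoder(asr_layer, n_layers)  # pass 1: transcription
        mt_layer = nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True)
        self.mt_decoder = nn.TransformerDecoder(mt_layer, n_layers)    # pass 2: translation
        self.src_embed = nn.Embedding(src_vocab, d_model)
        self.tgt_embed = nn.Embedding(tgt_vocab, d_model)
        self.src_out = nn.Linear(d_model, src_vocab)
        self.tgt_out = nn.Linear(d_model, tgt_vocab)

    def forward(self, speech_feats, src_tokens, tgt_tokens):
        # speech_feats: (B, T, feat_dim); src/tgt_tokens: (B, L) teacher-forced inputs
        memory = self.encoder(self.feat_proj(speech_feats))
        # Pass 1: decode the source-language transcription from the speech encoder.
        asr_hidden = self.asr_decoder(self.src_embed(src_tokens), memory)
        # Pass 2: decode the translation, attending to encoder states concatenated
        # with the recognition decoder's hidden states (one possible wiring).
        joint_memory = torch.cat([memory, asr_hidden], dim=1)
        mt_hidden = self.mt_decoder(self.tgt_embed(tgt_tokens), joint_memory)
        return self.src_out(asr_hidden), self.tgt_out(mt_hidden)

# Example: batch of 2 utterances, 50 acoustic frames, 10-token transcripts/translations.
model = SuccessiveDecodingST()
asr_logits, mt_logits = model(
    torch.randn(2, 50, 80),
    torch.zeros(2, 10, dtype=torch.long),
    torch.zeros(2, 10, dtype=torch.long))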

