Fused Acoustic and Text Encoding for Multimodal Bilingual Pretraining and Speech Translation

02/10/2021
by Renjie Zheng, et al.

Recently, text and speech representation learning has successfully improved many language-related tasks. However, all existing methods learn from only one input modality, whereas many speech-related tasks, such as speech translation, call for a unified acoustic and text representation. We propose a Fused Acoustic and Text Masked Language Model (FAT-MLM), which jointly learns a unified representation for both acoustic and text input. Within this cross-modal representation learning framework, we further present an end-to-end model for Fused Acoustic and Text Speech Translation (FAT-ST). Experiments on three translation directions show that our proposed speech translation models fine-tuned from FAT-MLM substantially improve translation quality (+5.90 BLEU).

research
10/28/2020

Bridging the Modality Gap for Speech-to-Text Translation

End-to-end speech translation aims to translate speech in one language i...
research
05/23/2023

Improving speech translation by fusing speech and text

In speech translation, leveraging multimodal data to improve model perfo...
research
06/09/2022

Revisiting End-to-End Speech-to-Text Translation From Scratch

End-to-end (E2E) speech-to-text translation (ST) often depends on pretra...
research
03/18/2022

A^3T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing

Recently, speech representation learning has improved many speech-relate...
research
03/20/2022

STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation

How to learn a better speech representation for end-to-end speech-to-tex...
research
09/21/2020

SDST: Successive Decoding for Speech-to-text Translation

End-to-end speech-to-text translation (ST), which directly translates th...
research
10/23/2019

Speech-XLNet: Unsupervised Acoustic Model Pretraining For Self-Attention Networks

Self-attention network (SAN) can benefit significantly from the bi-direc...
