ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation

05/24/2023
by Chenyang Le, et al.

Joint speech-language training is challenging because it demands large amounts of training data and GPU resources, and because of the modality gap between speech and language. We present ComSL, a speech-language model built atop a composite architecture of public pretrained speech-only and language-only models and optimized data-efficiently for spoken language tasks. In particular, we propose incorporating cross-modality learning into transfer learning and conducting both simultaneously for downstream tasks in a multi-task learning manner. Our approach proves effective in end-to-end speech-to-text translation, achieving a new state-of-the-art average BLEU score of 31.5 on the multilingual speech-to-English-text translation task for 21 languages, as measured on the public CoVoST2 evaluation set.
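
As a rough illustration of what training several objectives "in a multi-task learning manner" can look like, the sketch below combines per-task losses into a single scalar with a weighted sum. This is only one common way to do it, not the formulation confirmed by the abstract: the task names (`st`, `cross_modal`), the loss values, and the weights are all hypothetical placeholders.

```python
from typing import Mapping


def combine_task_losses(
    losses: Mapping[str, float],
    weights: Mapping[str, float],
) -> float:
    """Return the weighted sum of per-task losses.

    Assumption: a plain weighted sum is used here for illustration only;
    the abstract does not state how the objectives are combined.
    """
    # Every task that reports a loss must have a weight.
    missing = set(losses) - set(weights)
    if missing:
        raise KeyError(f"no weight given for tasks: {sorted(missing)}")
    return sum(weights[name] * value for name, value in losses.items())


# Hypothetical values for one training step.
step_losses = {"st": 2.0, "cross_modal": 1.0}
step_weights = {"st": 1.0, "cross_modal": 0.5}
total_loss = combine_task_losses(step_losses, step_weights)
```

In a real training loop the losses would be tensors produced by the model and the combined value would be backpropagated once per step, so that all objectives are optimized simultaneously rather than in separate stages.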

