End-to-End Speech Translation of Arabic to English Broadcast News

12/11/2022
by Fethi Bougares, et al.

Speech translation (ST) is the task of directly translating acoustic speech signals in a source language into text in a foreign language. The ST task has long been addressed using a pipeline approach with two modules: first an Automatic Speech Recognition (ASR) system in the source language, followed by a text-to-text Machine Translation (MT) system. In the past few years, we have seen a paradigm shift towards end-to-end approaches using sequence-to-sequence deep neural network models. This paper presents our efforts towards the development of the first Broadcast News end-to-end Arabic-to-English speech translation system. Starting from independent ASR and MT LDC releases, we were able to identify about 92 hours of Arabic audio recordings for which the manual transcription was also translated into English at the segment level. These data were used to train and compare pipeline and end-to-end speech translation systems under multiple scenarios, including transfer learning and data augmentation techniques.

research
04/25/2020

Jointly Trained Transformers models for Spoken Language Translation

Conventional spoken language translation (SLT) systems are pipeline base...
research
09/14/2019

Leveraging Out-of-Task Data for End-to-End Automatic Speech Translation

For automatic speech translation (AST), end-to-end approaches are outper...
research
11/07/2018

Towards Fluent Translations from Disfluent Speech

When translating from speech, special consideration for conversational s...
research
12/25/2021

Multi-Dialect Arabic Speech Recognition

This paper presents the design and development of multi-dialect automati...
research
05/25/2022

Investigating Lexical Replacements for Arabic-English Code-Switched Data Augmentation

Code-switching (CS) poses several challenges to NLP tasks, where data sp...
research
11/24/2015

Spoken Language Translation for Polish

Spoken language translation (SLT) is becoming more important in the incr...
research
06/06/2023

Towards End-to-end Speech-to-text Summarization

Speech-to-text (S2T) summarization is a time-saving technique for filter...
