Low-Resource Speech-to-Text Translation

03/24/2018
by   Sameer Bansal, et al.

Speech-to-text translation has many potential applications for low-resource languages, but the typical approach of cascading speech recognition with machine translation is often impossible, since the transcripts needed to train a speech recognizer are usually not available for low-resource languages. Recent work has found that neural encoder-decoder models can learn to directly translate foreign speech in high-resource scenarios, without the need for intermediate transcription. We investigate whether this approach also works in settings where both data and computation are limited. To make the approach efficient, we make several architectural changes, including a change from character-level to word-level decoding. We find that this choice yields crucial speed improvements that allow us to train with fewer computational resources, yet still performs well on frequent words. We explore models trained on between 20 and 160 hours of data, and find that although models trained on less data have considerably lower BLEU scores, they can still predict words with relatively high precision and recall: around 50% for a model trained on 50 hours of data, versus around 60% for the full 160 hour model. Thus, they may still be useful for some low-resource scenarios.
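
The word-level precision and recall mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how these scores are computed, so the function below is an assumption for illustration only: it uses whitespace tokenization, lowercasing, and clipped unigram matching, and the name `word_precision_recall` is hypothetical.

```python
from collections import Counter


def word_precision_recall(hypothesis: str, reference: str) -> tuple[float, float]:
    """Clipped unigram precision and recall of a hypothesis against one reference.

    Assumption: tokens are lowercased and split on whitespace. The paper's
    actual evaluation may differ.
    """
    hyp_counts = Counter(hypothesis.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Counter intersection keeps the minimum count per word, so a word
    # predicted more often than it occurs in the reference is credited
    # only as many times as it appears there.
    overlap = sum((hyp_counts & ref_counts).values())

    hyp_total = sum(hyp_counts.values())
    ref_total = sum(ref_counts.values())
    precision = overlap / hyp_total if hyp_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    return precision, recall


# Example: every predicted word is correct, but one reference word is missed.
precision, recall = word_precision_recall("the cat sat", "the cat sat down")
print(precision, recall)
```

A score of this kind rewards getting content words right even when word order or fluency is poor, which is one reason it can stay relatively high while BLEU drops.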


research
05/31/2023

Strategies for improving low resource speech to text translation relying on pre-trained ASR models

This paper presents techniques and findings for improving the performanc...
research
05/04/2022

ON-TRAC Consortium Systems for the IWSLT 2022 Dialect and Low-resource Speech Translation Tasks

This paper describes the ON-TRAC Consortium translation systems develope...
research
09/21/2021

On the Difficulty of Segmenting Words with Attention

Word segmentation, the problem of finding word boundaries in speech, is ...
research
09/17/2017

Unwritten Languages Demand Attention Too! Word Discovery with Encoder-Decoder Models

Word discovery is the task of extracting words from unsegmented text. In...
research
06/08/2021

Unsupervised Word Segmentation from Discrete Speech Units in Low-Resource Settings

When documenting oral-languages, Unsupervised Word Segmentation (UWS) fr...
research
02/14/2017

A case study on using speech-to-translation alignments for language documentation

For many low-resource or endangered languages, spoken language resources...
research
02/19/2018

Tied Multitask Learning for Neural Speech Translation

We explore multitask models for neural translation of speech, augmenting...
