SpeechAlign: a Framework for Speech Translation Alignment Evaluation

09/20/2023
by Belen Alastruey, et al.

Speech-to-Speech and Speech-to-Text translation are active areas of research. To contribute to these fields, we present SpeechAlign, a framework for evaluating the underexplored problem of source-target alignment in speech models. Our framework has two core components. First, to address the absence of suitable evaluation datasets, we introduce the Speech Gold Alignment dataset, built upon an English-German text translation gold alignment dataset. Second, we introduce two novel metrics, Speech Alignment Error Rate (SAER) and Time-weighted Speech Alignment Error Rate (TW-SAER), to evaluate alignment quality in speech models. By publishing SpeechAlign we provide an accessible evaluation framework for model assessment, and we employ it to benchmark open-source Speech Translation models.
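The abstract does not define SAER or TW-SAER, so the snippet below does not attempt to reproduce them. As background only, it sketches the classical text Alignment Error Rate (AER) that gold alignment datasets are commonly scored with, AER = 1 - (|A ∩ S| + |A ∩ P|) / (|A| + |S|), where A is the predicted link set, S the sure gold links, and P the possible gold links. The function name and the toy link sets are illustrative assumptions, not part of SpeechAlign.

```python
def alignment_error_rate(hypothesis, sure, possible):
    """Classical text AER over sets of (source_index, target_index) links.

    This is NOT the paper's SAER; it is the standard word-level metric,
    shown only as an assumed point of reference.
    """
    a = set(hypothesis)
    s = set(sure)
    # By convention, every sure link also counts as a possible link.
    p = set(possible) | s
    if not a and not s:
        return 0.0  # nothing predicted and nothing required: no error
    return 1.0 - (len(a & s) + len(a & p)) / (len(a) + len(s))


# Toy example (hypothetical indices): two sure links, one possible link.
sure_links = {(0, 0), (1, 1)}
possible_links = {(2, 1)}
predicted_links = {(0, 0), (2, 1), (2, 2)}

aer = alignment_error_rate(predicted_links, sure_links, possible_links)
print(f"AER = {aer:.2f}")
```

Lower values are better: a prediction that matches the sure links exactly scores 0, and links outside both gold sets raise the error.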

research
10/17/2019

LibriVoxDeEn: A Corpus for German-to-English Speech Translation and Speech Recognition

We present a corpus of sentence-aligned triples of German audio, German ...
research
05/30/2023

Weakly-supervised forced alignment of disfluent speech using phoneme-level modeling

The study of speech disorders can benefit greatly from time-aligned data...
research
04/24/2019

Phonetically-Oriented Word Error Alignment for Speech Recognition Error Analysis in Speech Translation

We propose a variation to the commonly used Word Error Rate (WER) metric...
research
05/29/2020

Neural Simultaneous Speech Translation Using Alignment-Based Chunking

In simultaneous machine translation, the objective is to determine when ...
research
04/06/2022

Prosodic Alignment for off-screen automatic dubbing

The goal of automatic dubbing is to perform speech-to-speech translation...
research
05/22/2023

Improving Metrics for Speech Translation

We introduce Parallel Paraphrasing (Para_both), an augmentation method f...
research
06/28/2022

On the Impact of Noises in Crowd-Sourced Data for Speech Translation

Training speech translation (ST) models requires large and high-quality ...
