Fluent and Low-latency Simultaneous Speech-to-Speech Translation with Self-adaptive Training

10/20/2020
by Renjie Zheng, et al.

Simultaneous speech-to-speech translation is widely useful but extremely challenging, since it needs to generate target-language speech concurrently with the source-language speech, with only a few seconds' delay. In addition, it needs to continuously translate a stream of sentences, but all recent solutions focus only on the single-sentence scenario. As a result, current approaches accumulate latency progressively when the speaker talks faster and introduce unnatural pauses when the speaker talks slower. To overcome these issues, we propose Self-Adaptive Translation (SAT), which flexibly adjusts the length of translations to accommodate different source speech rates. At similar levels of translation quality (as measured by BLEU), our method generates more fluent target speech (as measured by the naturalness metric MOS) with substantially lower latency than the baseline, in both Zh↔En directions.

research
10/19/2018

STACL: Simultaneous Translation with Integrated Anticipation and Controllable Latency

Simultaneous translation, which translates sentences before they are fin...
research
09/20/2023

Long-Form End-to-End Speech Translation via Latent Alignment Segmentation

Current simultaneous speech translation models can process audio only up...
research
11/22/2022

Average Token Delay: A Latency Metric for Simultaneous Translation

Simultaneous translation is a task in which translation begins before th...
research
12/06/2019

Re-Translation Strategies For Long Form, Simultaneous, Spoken Language Translation

We investigate the problem of simultaneous machine translation of long-f...
research
06/12/2022

Over-Generation Cannot Be Rewarded: Length-Adaptive Average Lagging for Simultaneous Speech Translation

Simultaneous speech translation (SimulST) systems aim at generating thei...
research
12/15/2021

Textless Speech-to-Speech Translation on Real Data

We present a textless speech-to-speech translation (S2ST) system that ca...
research
10/13/2021

Decision Attentive Regularization to Improve Simultaneous Speech Translation Systems

Simultaneous Speech-to-text Translation (SimulST) systems translate sour...
