Jointly Optimizing Translations and Speech Timing to Improve Isochrony in Automatic Dubbing

02/25/2023
by   Alexandra Chronopoulou, et al.

Automatic dubbing (AD) is the task of translating the original speech in a video into target-language speech. The new speech should satisfy isochrony; that is, it should be time-aligned with the original video, including mouth movements, pauses, and hand gestures. In this paper, we propose training a model that directly optimizes both the translation and the speech duration of the generated translations. We show that, compared to prior work, this system generates speech that better matches the timing of the original speech while simplifying the system architecture.
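
To make the notion of isochrony concrete, here is a minimal sketch of one way timing agreement between source and dubbed speech could be scored. It is an illustrative assumption, not the metric or model from the paper: the names `Segment`, `total_overlap`, `total_duration`, and `speech_overlap` are hypothetical, and the score is a plain intersection-over-union of speech intervals given in seconds.

```python
from typing import List, Tuple

# Hypothetical representation: one speech segment as (start_seconds, end_seconds).
Segment = Tuple[float, float]


def total_duration(segments: List[Segment]) -> float:
    """Total speaking time, assuming segments within a list do not overlap."""
    return sum(end - start for start, end in segments)


def total_overlap(a: List[Segment], b: List[Segment]) -> float:
    """Sum of pairwise intersection lengths between two segment lists."""
    return sum(
        max(0.0, min(a_end, b_end) - max(a_start, b_start))
        for a_start, a_end in a
        for b_start, b_end in b
    )


def speech_overlap(source: List[Segment], target: List[Segment]) -> float:
    """Intersection-over-union of source and target speech time.

    1.0 means the dubbed speech occupies exactly the same time spans as the
    original; lower values mean speech spills into pauses or leaves gaps.
    """
    inter = total_overlap(source, target)
    union = total_duration(source) + total_duration(target) - inter
    return inter / union if union > 0 else 1.0


# Original speech: two phrases separated by a one-second pause.
source = [(0.0, 2.0), (3.0, 5.0)]

# A dub that keeps the same timing, and one whose first phrase runs
# through the pause.
aligned_dub = [(0.0, 2.0), (3.0, 5.0)]
long_dub = [(0.0, 3.0), (3.0, 5.0)]

print(speech_overlap(source, aligned_dub))  # 1.0
print(speech_overlap(source, long_dub))     # 0.8
```

A translation that is too verbose lengthens the target segments and lowers this score, which is the kind of mismatch a duration-aware translation model tries to avoid.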

research
05/22/2023

Improving Isochronous Machine Translation with Target Factors and Auxiliary Counters

To translate speech for automatic dubbing, machine translation needs to ...
research
08/29/2019

Classifying topics in speech when all you have is crummy translations

Given a large amount of unannotated speech in a language with few resour...
research
02/12/2019

Puppet Dubbing

Dubbing puppet videos to make the characters (e.g. Kermit the Frog) conv...
research
10/08/2021

Machine Translation Verbosity Control for Automatic Dubbing

Automatic dubbing aims at seamlessly replacing the speech in a video doc...
research
09/27/2022

Direct Speech Translation for Automatic Subtitling

Automatic subtitling is the task of automatically translating the speech...
research
10/02/2019

Speech-to-speech Translation between Untranscribed Unknown Languages

In this paper, we explore a method for training speech-to-speech transla...
research
12/23/2022

Dubbing in Practice: A Large Scale Study of Human Localization With Insights for Automatic Dubbing

We investigate how humans perform the task of dubbing video content from...
