MeetDot: Videoconferencing with Live Translation Captions

09/20/2021
by   Arkady Arkhangorodsky, et al.

We present MeetDot, a videoconferencing system with live translation captions overlaid on screen. The system aims to facilitate conversation between people who speak different languages, thereby reducing communication barriers between multilingual participants. Currently, our system supports speech and captions in four languages and combines automatic speech recognition (ASR) and machine translation (MT) in a cascade. We use a re-translation strategy to translate the streamed speech, which results in caption flicker. Additionally, the system must meet very strict latency requirements to maintain acceptable call quality. We implement several features to enhance user experience and reduce users' cognitive load, such as smooth-scrolling captions and reduced caption flicker. The modular architecture allows us to integrate different ASR and MT services in our backend. Our system provides an integrated evaluation suite to optimize key intrinsic evaluation metrics such as accuracy, latency and erasure. Finally, we present an innovative cross-lingual word-guessing game as an extrinsic evaluation metric to measure end-to-end system performance. We plan to make our system open-source for research purposes.
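The abstract names erasure as an intrinsic metric but does not define it. As a minimal sketch, assuming the definition commonly used in the re-translation literature (the number of trailing tokens that must be deleted from the previous caption before the new one can be shown, normalized by the final caption length), the flicker caused by re-translation could be measured as follows. The function names and the token-level granularity are illustrative assumptions, not MeetDot's actual evaluation code.

```python
from typing import List, Sequence


def common_prefix_len(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common token prefix of a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def erasure(prev: Sequence[str], curr: Sequence[str]) -> int:
    """Tokens of the previous caption that are erased by the update.

    Assumed definition: everything after the shared prefix of the
    previous caption has to be removed before curr can be displayed.
    """
    return len(prev) - common_prefix_len(prev, curr)


def normalized_erasure(updates: List[List[str]]) -> float:
    """Total erasure over all caption updates divided by final length.

    `updates` is the sequence of successive translations shown for one
    utterance; an empty sequence or empty final caption yields 0.0.
    """
    if not updates or not updates[-1]:
        return 0.0
    total = sum(erasure(p, c) for p, c in zip(updates, updates[1:]))
    return total / len(updates[-1])


if __name__ == "__main__":
    # Hypothetical caption history for one utterance.
    history = [["the"], ["the", "cat"], ["the", "dog", "sat"]]
    print(normalized_erasure(history))
```

In this hypothetical history only the second update rewrites a displayed token ("cat" becomes "dog"), so a lower score corresponds to a more stable caption.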

research
05/30/2020

Dynamic Masking for Improved Stability in Spoken Language Translation

For spoken language translation (SLT) in live scenarios such as conferen...
research
09/14/2019

Leveraging Out-of-Task Data for End-to-End Automatic Speech Translation

For automatic speech translation (AST), end-to-end approaches are outper...
research
04/11/2022

Large-Scale Streaming End-to-End Speech Translation with Neural Transducers

Neural transducers have been widely used in automatic speech recognition...
research
08/09/2021

The HW-TSC's Offline Speech Translation Systems for IWSLT 2021 Evaluation

This paper describes our work in participation of the IWSLT-2021 offline...
research
10/08/2019

One-To-Many Multilingual End-to-end Speech Translation

Nowadays, training end-to-end neural models for spoken language translat...
research
02/11/2022

Evaluating MT Systems: A Theoretical Framework

This paper outlines a theoretical framework using which different automa...
