TRAVID: An End-to-End Video Translation Framework

09/20/2023
by   Prottay Kumar Adhikary, et al.
0

In today's globalized world, effective communication with people from diverse linguistic backgrounds has become increasingly crucial. While traditional methods of language translation, such as written text or voice-only translations, can accomplish the task, they often fail to capture the complete context and nuanced information conveyed through nonverbal cues like facial expressions and lip movements. In this paper, we present an end-to-end video translation system that not only translates spoken language but also synchronizes the translated speech with the lip movements of the speaker. Our system focuses on translating educational lectures in various Indian languages, and it is designed to be effective even in low-resource system settings. By incorporating lip movements that align with the target language and matching them with the speaker's voice using voice cloning techniques, our application offers an enhanced experience for students and users. This additional feature creates a more immersive and realistic learning environment, ultimately making the learning process more effective and engaging.

READ FULL TEXT
research
06/09/2022

Face-Dubbing++: Lip-Synchronous, Voice Preserving Translation of Videos

In this paper, we propose a neural end-to-end system for voice preservin...
research
11/06/2020

Large-scale multilingual audio visual dubbing

We describe a system for large-scale audiovisual translation and dubbing...
research
02/04/2020

CoVoST: A Diverse Multilingual Speech-To-Text Translation Corpus

Spoken language translation has recently witnessed a resurgence in popul...
research
06/28/2022

Show Me Your Face, And I'll Tell You How You Speak

When we speak, the prosody and content of the speech can be inferred fro...
research
07/19/2021

Translatotron 2: Robust direct speech-to-speech translation

We present Translatotron 2, a neural direct speech-to-speech translation...
research
11/29/2022

Learnings from Technological Interventions in a Low Resource Language: Enhancing Information Access in Gondi

The primary obstacle to developing technologies for low-resource languag...
research
10/13/2022

Pre-Avatar: An Automatic Presentation Generation Framework Leveraging Talking Avatar

Since the beginning of the COVID-19 pandemic, remote conferencing and sc...

Please sign up or login with your details

Forgot password? Click here to reset