Synchronous Bidirectional Learning for Multilingual Lip Reading

05/08/2020
by   Mingshuang Luo, et al.
0

Lip reading has received increasing attention in recent years. This paper focuses on the synergy of multilingual lip reading. There are more than 7,000 languages in the world, which implies that it is impractical to train separate lip reading models by collecting large-scale data per language. Although each language has its own linguistic and pronunciation features, the lip movements of all languages share similar patterns. Based on this idea, in this paper, we try to explore the synergized learning of multilingual lip reading, and further propose a synchronous bidirectional learning(SBL) framework for effective synergy of multilingual lip reading. Firstly, we introduce the phonemes as our modeling units for the multilingual setting. Similar phoneme always leads to similar visual patterns. The multilingual setting would increase both the quantity and the diversity of each phoneme shared among different languages. So the learning for the multilingual target should bring improvement to the prediction of phonemes. Then, a SBL block is proposed to infer the target unit when given its previous and later context. The rules for each specific language which the model itself judges to be is learned in this fill-in-the-blank manner. To make the learning process more targeted at each particular language, we introduce an extra task of predicting the language identity in the learning process. Finally, we perform a thorough comparison on LRW (English) and LRW-1000(Mandarin). The results outperform the existing state of the art by a large margin, and show the promising benefits from the synergized learning of different languages.

READ FULL TEXT
research
07/17/2023

Multilingual Speech-to-Speech Translation into Multiple Target Languages

Speech-to-speech translation (S2ST) enables spoken communication between...
research
03/12/2020

Deformation Flow Based Two-Stream Network for Lip Reading

Lip reading is the task of recognizing the speech content by analyzing m...
research
02/22/2022

NU HLT at CMCL 2022 Shared Task: Multilingual and Crosslingual Prediction of Human Reading Behavior in Universal Language Space

In this paper, we present a unified model that works for both multilingu...
research
12/09/2022

BigScience: A Case Study in the Social Construction of a Multilingual Large Language Model

The BigScience Workshop was a value-driven initiative that spanned one a...
research
02/01/2023

Inference of Partial Colexifications from Multilingual Wordlists

The past years have seen a drastic rise in studies devoted to the invest...
research
05/24/2023

MRN: Multiplexed Routing Network for Incremental Multilingual Text Recognition

Traditional Multilingual Text Recognition (MLTR) usually targets a fixed...
research
07/07/2023

Testing the Predictions of Surprisal Theory in 11 Languages

A fundamental result in psycholinguistics is that less predictable words...

Please sign up or login with your details

Forgot password? Click here to reset