A Multimodal German Dataset for Automatic Lip Reading Systems and Transfer Learning

02/27/2022
by Gerald Schwiebert, et al.
Large datasets of the kind required for deep learning of lip reading do not exist for many languages. In this paper we present GLips (German Lips), a dataset of 250,000 publicly available videos of the faces of speakers in the Hessian Parliament, processed for word-level lip reading by an automatic pipeline. The format is similar to that of the English-language LRW (Lip Reading in the Wild) dataset: each video encodes one word of interest in a context of 1.16 seconds duration, which makes the two datasets compatible for studying transfer learning. By training a deep neural network, we investigate whether lip reading has language-independent features, so that datasets of different languages can be used to improve lip reading models. We demonstrate learning from scratch and show that transfer learning from LRW to GLips and vice versa improves learning speed and performance, in particular on the validation set.
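
To make the fixed-length clip format concrete, the following is a minimal sketch of how a 1.16-second context around one word could be derived from word timestamps. It is not the authors' pipeline: the function name `clip_window`, the frame rate of 25 fps, and the choice to centre the window on the word are assumptions made for illustration only. Only the 1.16-second duration comes from the abstract.

```python
def clip_window(word_start, word_end, clip_duration=1.16, fps=25):
    """Return (clip_start, clip_end, n_frames) for a clip around one word.

    Hypothetical helper: centring the word and the 25 fps frame rate are
    assumptions; only the 1.16 s clip duration is taken from the abstract.
    """
    if word_end < word_start:
        raise ValueError("word_end must not precede word_start")
    midpoint = (word_start + word_end) / 2.0
    # Centre the fixed-length context on the word; never start before t = 0.
    clip_start = max(0.0, midpoint - clip_duration / 2.0)
    clip_end = clip_start + clip_duration
    # Number of video frames covered by the clip at the assumed frame rate.
    n_frames = round(clip_duration * fps)
    return clip_start, clip_end, n_frames


# Example: a word spoken between 10.0 s and 10.4 s of a recording.
start, end, frames = clip_window(word_start=10.0, word_end=10.4)
print(f"{start:.2f} s to {end:.2f} s, {frames} frames")
```

Under these assumptions every clip has the same number of frames regardless of how long the word itself is, which is what would let a model trained on one dataset take clips from the other as input without reshaping.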

research
04/08/2023

Word-level Persian Lipreading Dataset

Lip-reading has made impressive progress in recent years, driven by adva...
research
09/25/2018

Non-native children speech recognition through transfer learning

This work deals with non-native children's speech and investigates both ...
research
01/10/2022

Transfer Learning for Scene Text Recognition in Indian Languages

Scene text recognition in low-resource Indian languages is challenging b...
research
10/15/2021

Scribosermo: Fast Speech-to-Text models for German and other Languages

Recent Speech-to-Text models often require a large amount of hardware re...
research
10/11/2019

Automatic segmentation of texts into units of meaning for reading assistance

The emergence of the digital book is a major step forward in providing a...
research
06/06/2023

Automatic Assessment of Oral Reading Accuracy for Reading Diagnostics

Automatic assessment of reading fluency using automatic speech recogniti...
research
07/01/2020

Exploiting the Logits: Joint Sign Language Recognition and Spell-Correction

Machine learning techniques have excelled in the automatic semantic anal...
