Scribosermo: Fast Speech-to-Text models for German and other Languages

10/15/2021
by Daniel Bermuth, et al.

Recent Speech-to-Text models often require large amounts of hardware resources and are mostly trained on English. This paper presents Speech-to-Text models for German, as well as for Spanish and French, with three special features: (a) They are small and run in real time on microcontrollers like a Raspberry Pi. (b) Starting from a pretrained English model, they can be trained on consumer-grade hardware with a relatively small dataset. (c) The models are competitive with other solutions and outperform them in German. The models thus combine the advantages of other approaches, each of which offers only a subset of these features. Furthermore, the paper provides a new library for handling datasets, designed for easy extension with additional datasets, and shows an optimized way of transfer-learning new languages from a pretrained model in another language with a similar alphabet.
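One way to picture transfer between languages with similar alphabets is to reuse the pretrained output-layer weights for every character the two alphabets share and to initialize only the new characters from scratch. The sketch below illustrates that idea in plain Python. It is an assumption made for illustration, not the procedure described in the paper: the function name `remap_output_layer`, the toy weight values, the alphabets, and the initialization scale are all hypothetical.

```python
import random


def remap_output_layer(src_weights, src_alphabet, dst_alphabet,
                       init_scale=0.01, seed=0):
    """Build output-layer rows for dst_alphabet from a pretrained layer.

    Hypothetical helper (not from the paper). src_weights holds one row
    per character of src_alphabet. Rows of shared characters are copied;
    rows of characters missing from src_alphabet are filled with small
    uniform random values.
    """
    if not src_weights or len(src_weights) != len(src_alphabet):
        raise ValueError("need exactly one weight row per source character")
    rng = random.Random(seed)
    width = len(src_weights[0])
    src_index = {ch: i for i, ch in enumerate(src_alphabet)}
    dst_weights = []
    for ch in dst_alphabet:
        if ch in src_index:
            # Shared character: reuse what the pretrained model learned.
            dst_weights.append(list(src_weights[src_index[ch]]))
        else:
            # New character (e.g. an umlaut): start from small random values.
            dst_weights.append(
                [rng.uniform(-init_scale, init_scale) for _ in range(width)]
            )
    return dst_weights


# Toy alphabets, assumed for illustration only.
english = list("abcdefghijklmnopqrstuvwxyz' ")
german = list("abcdefghijklmnopqrstuvwxyzäöüß ")

# Stand-in for pretrained weights: row i is filled with the value i.
pretrained = [[float(i)] * 4 for i in range(len(english))]
german_layer = remap_output_layer(pretrained, english, german)
```

In this sketch the apostrophe row of the English layer is simply dropped because the assumed German alphabet does not contain it, while the four new rows for ä, ö, ü and ß still have to be learned during fine-tuning.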


research
02/08/2021

Effects of Layer Freezing when Transferring DeepSpeech to New Languages

In this paper, we train Mozilla's DeepSpeech architecture on German and ...
research
10/09/2019

Spoken Language Identification using ConvNets

Language Identification (LI) is an important first step in several speec...
research
04/22/2022

LibriS2S: A German-English Speech-to-Speech Translation Corpus

Recently, we have seen an increasing interest in the area of speech-to-t...
research
05/31/2023

Text-to-Speech Pipeline for Swiss German – A comparison

In this work, we studied the synthesis of Swiss German speech using diff...
research
07/13/2022

A Transfer Learning Based Model for Text Readability Assessment in German

Text readability assessment has a wide range of applications for differe...
research
02/27/2022

A Multimodal German Dataset for Automatic Lip Reading Systems and Transfer Learning

Large datasets as required for deep learning of lip reading do not exist...
research
06/11/2021

Sprachsynthese – State-of-the-Art in englischer und deutscher Sprache

Reading text aloud is an important feature for modern computer applicati...
