Open Source Automatic Speech Recognition for German

07/26/2018
by Benjamin Milde, et al.

High-quality Automatic Speech Recognition (ASR) is a prerequisite for speech-based applications and research. While state-of-the-art ASR software is freely available, language-dependent acoustic models are lacking for languages other than English because little training data is freely available. We train acoustic models for German with Kaldi on two datasets, both distributed under a Creative Commons license. The resulting model is freely redistributable, lowering the cost of entry for German ASR. The models are trained on a total of 412 hours of German read speech, and we achieve a relative word error reduction of 26% by adding data from the Spoken Wikipedia Corpus to the previously best freely available German acoustic model recipe and dataset. Our best model achieves a word error rate of 14.38% on the Tuda-De test set. Owing to the large number of speakers and the diversity of topics in the training data, our model is robust against speaker variation and topic shift.
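
The abstract reports its results as a word error rate (WER) and a relative word error reduction. As a minimal sketch of how these two metrics are conventionally defined (this is not the paper's scoring pipeline; the function names `word_error_rate` and `relative_reduction` and the example sentences are hypothetical and chosen only for illustration), WER is the word-level edit distance between reference and hypothesis divided by the number of reference words, and the relative reduction compares a new WER against a baseline WER:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction by which new_wer improves on baseline_wer."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Toy usage with invented sentences and invented WER values
# (not taken from the paper).
toy_wer = word_error_rate("das ist ein kleiner test", "das ist kleiner text")
toy_gain = relative_reduction(baseline_wer=20.0, new_wer=15.0)
print(f"toy WER: {toy_wer:.2%}, toy relative reduction: {toy_gain:.2%}")
```

Note that a relative reduction is a ratio of error rates, so a 26% relative reduction does not mean the WER dropped by 26 percentage points.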
