RASR2: The RWTH ASR Toolkit for Generic Sequence-to-sequence Speech Recognition

05/28/2023
by   Wei Zhou, et al.
0

Modern public ASR tools usually provide rich support for training various sequence-to-sequence (S2S) models, but rather simple support for decoding open-vocabulary scenarios only. For closed-vocabulary scenarios, public tools supporting lexical-constrained decoding are usually only for classical ASR, or do not support all S2S models. To eliminate this restriction on research possibilities such as modeling unit choice, we present RASR2 in this work, a research-oriented generic S2S decoder implemented in C++. It offers a strong flexibility/compatibility for various S2S models, language models, label units/topologies and neural network architectures. It provides efficient decoding for both open- and closed-vocabulary scenarios based on a generalized search framework with rich support for different search modes and settings. We evaluate RASR2 with a wide range of experiments on both switchboard and Librispeech corpora. Our source code is public online.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
11/06/2017

Multilingual Speech Recognition With A Single End-To-End Model

Training a conventional automatic speech recognition (ASR) system to sup...
research
08/15/2017

Comparison of Decoding Strategies for CTC Acoustic Models

Connectionist Temporal Classification has recently attracted a lot of in...
research
07/09/2021

On lattice-free boosted MMI training of HMM and CTC-based full-context ASR models

Hybrid automatic speech recognition (ASR) models are typically sequentia...
research
05/25/2018

OpenSeq2Seq: extensible toolkit for distributed and mixed precision training of sequence-to-sequence models

We present OpenSeq2Seq -- an open-source toolkit for training sequence-t...
research
04/03/2018

Attentive Sequence-to-Sequence Learning for Diacritic Restoration of Yorùbá Language Text

Yorùbá is a widely spoken West African language with a writing system ri...
research
03/29/2022

WeNet 2.0: More Productive End-to-End Speech Recognition Toolkit

Recently, we made available WeNet, a production-oriented end-to-end spee...
research
01/14/2022

A Study of Transducer based End-to-End ASR with ESPnet: Architecture, Auxiliary Loss and Decoding Strategies

In this study, we present recent developments of models trained with the...

Please sign up or login with your details

Forgot password? Click here to reset