Ralf Schlüter

research

∙ 09/15/2023

Mixture Encoder Supporting Continuous Speech Separation for Meeting Recognition

Many real-life applications of automatic speech recognition (ASR) requir...

Peter Vieting, et al.

research

∙ 09/15/2023

Chunked Attention-based Encoder-Decoder Model for Streaming Speech Recognition

We study a streamable attention-based encoder-decoder model in which eit...

Mohammad Zeineldeen, et al.

research

∙ 08/08/2023

Comparative Analysis of the wav2vec 2.0 Feature Extractor

Automatic speech recognition (ASR) systems typically use handcrafted fea...

Peter Vieting, et al.

research

∙ 06/21/2023

Mixture Encoder for Joint Speech Separation and Recognition

Multi-speaker automatic speech recognition (ASR) is crucial for many rea...

Simon Berger, et al.

research

∙ 06/15/2023

Competitive and Resource Efficient Factored Hybrid HMM Systems are Simpler Than You Think

Building competitive hybrid hidden Markov model (HMM) systems for automa...

Tina Raissi, et al.

research

∙ 05/28/2023

RASR2: The RWTH ASR Toolkit for Generic Sequence-to-sequence Speech Recognition

Modern public ASR tools usually provide rich support for training variou...

Wei Zhou, et al.

research

∙ 03/03/2023

End-to-End Speech Recognition: A Survey

In the last decade of automatic speech recognition (ASR) research, the i...

Rohit Prabhavalkar, et al.

research

∙ 01/11/2023

Improving And Analyzing Neural Speaker Embeddings for ASR

Neural speaker embeddings encode the speaker's speech characteristics th...

Christoph Lüscher, et al.

research

∙ 12/07/2022

Lattice-Free Sequence Discriminative Training for Phoneme-Based Neural Transducers

Recently, RNN-Transducers have achieved remarkable results on various au...

Zijian Yang, et al.

research

∙ 10/26/2022

Efficient Use of Large Pre-Trained Models for Low Resource ASR

Automatic speech recognition (ASR) has been established as a well-perfor...

Peter Vieting, et al.

research

∙ 10/26/2022

Monotonic segmental attention for automatic speech recognition

We introduce a novel segmental-attention model for automatic speech reco...

Albert Zeyer, et al.

research

∙ 10/24/2022

Development of Hybrid ASR Systems for Low Resource Medical Domain Conversational Telephone Speech

Language barriers present a great challenge in our increasingly connecte...

Christoph Lüscher, et al.

research

∙ 10/18/2022

HMM vs. CTC for Automatic Speech Recognition: Comparison Based on Full-Sum Training from Scratch

In this work, we compare from-scratch sequence-level cross-entropy (full...

Tina Raissi, et al.

research

∙ 06/26/2022

Improving the Training Recipe for a Robust Conformer-based Hybrid Model

Speaker adaptation is important to build robust automatic speech recogni...

Mohammad Zeineldeen, et al.

research

∙ 04/22/2022

Efficient Training of Neural Transducer for Speech Recognition

As one of the most popular sequence-to-sequence modeling approaches for ...

Wei Zhou, et al.

research

∙ 01/24/2022

Improving Factored Hybrid HMM Acoustic Modeling without State Tying

In this work, we show that a factored hybrid hidden Markov model (FH-HMM...

Tina Raissi, et al.

research

∙ 11/13/2021

Prediction of Listener Perception of Argumentative Speech in a Crowdsourced Dataset Using (Psycho-)Linguistic and Fluency Features

One of the key communicative competencies is the ability to maintain flu...

Yu Qiao, et al.

research

∙ 11/11/2021

Self-Normalized Importance Sampling for Neural Language Modeling

To mitigate the problem of having to traverse over the full vocabulary i...

Zijian Yang, et al.

research

∙ 11/05/2021

Conformer-based Hybrid ASR System for Switchboard Dataset

The recently proposed conformer architecture has been successfully used ...

Mohammad Zeineldeen, et al.

research

∙ 10/18/2021

Automatic Learning of Subword Dependent Model Scales

To improve the performance of state-of-the-art automatic speech recognit...

Felix Meyer, et al.

research

∙ 10/18/2021

Efficient Sequence Training of Attention Models using Approximative Recombination

Sequence discriminative training is a great tool to improve the performa...

Nils-Philipp Wynands, et al.

research

∙ 10/13/2021

On Language Model Integration for RNN Transducer based Speech Recognition

The mismatch between an external language model (LM) and the implicitly ...

Wei Zhou, et al.

research

∙ 05/31/2021

Why does CTC result in peaky behavior?

The peaky behavior of CTC models is well known experimentally. However, ...

Albert Zeyer, et al.

research

∙ 04/21/2021

On Sampling-Based Training Criteria for Neural Language Modeling

As the vocabulary size of modern word-based language models becomes ever...

Yingbo Gao, et al.

research

∙ 04/19/2021

Acoustic Data-Driven Subword Modeling for End-to-End Speech Recognition

Subword units are commonly used for end-to-end automatic speech recognit...

Wei Zhou, et al.

research

∙ 04/17/2021

The Impact of ASR on the Automatic Analysis of Linguistic Complexity and Sophistication in Spontaneous L2 Speech

In recent years, automated approaches to assessing linguistic complexity...

Yu Qiao, et al.

research

∙ 04/13/2021

Equivalence of Segmental and Neural Transducer Modeling: A Proof of Concept

With the advent of direct models in automatic speech recognition (ASR), ...

Wei Zhou, et al.

research

∙ 04/12/2021

Investigating Methods to Improve Language Model Integration for Attention-based Encoder-Decoder ASR Models

Attention-based encoder-decoder (AED) models learn an implicit internal ...

Mohammad Zeineldeen, et al.

research

∙ 04/12/2021

Comparing the Benefit of Synthetic Training Data for Various Automatic Speech Recognition Architectures

Recent publications on automatic-speech-recognition (ASR) have a strong ...

Nick Rossenbach, et al.

research

∙ 04/09/2021

Feature Replacement and Combination for Hybrid ASR Systems

Acoustic modeling of raw waveform and learning feature extractors as par...

Peter Vieting, et al.

research

∙ 04/07/2021

Librispeech Transducer Model with Internal Language Model Prior Correction

We present our transducer model on Librispeech. We study variants to inc...

Albert Zeyer, et al.

research

∙ 04/06/2021

Towards Consistent Hybrid HMM Acoustic Modeling

High-performance hybrid automatic speech recognition (ASR) systems are o...

Tina Raissi, et al.

research

∙ 03/30/2021

A study of latent monotonic attention variants

End-to-end models reach state-of-the-art performance for speech recognit...

Albert Zeyer, et al.

research

∙ 11/24/2020

Tight Integrated End-to-End Training for Cascaded Speech Translation

A cascaded speech translation model relies on discrete and non-different...

Parnia Bahar, et al.

research

∙ 10/30/2020

Phoneme Based Neural Transducer for Large Vocabulary Speech Recognition

To join the advantages of classical and end-to-end approaches for speech...

Wei Zhou, et al.

research

∙ 05/20/2020

Investigation of Large-Margin Softmax in Neural Language Modeling

To encourage intra-class compactness and inter-class separability among ...

Jingjing Huo, et al.

research

∙ 05/20/2020

Early Stage LM Integration Using Local and Global Log-Linear Combination

Sequence-to-sequence models with an implicit alignment mechanism (e.g. a...

Wilfried Michel, et al.

research

∙ 05/19/2020

Investigations on Phoneme-Based End-To-End Speech Recognition

Common end-to-end models like CTC or encoder-decoder-attention models us...

Albert Zeyer, et al.

research

∙ 05/19/2020

A New Training Pipeline for an Improved Neural Transducer

The RNN transducer is a promising end-to-end model candidate. We compare...

Albert Zeyer, et al.

research

∙ 05/15/2020

Context-Dependent Acoustic Modeling without Explicit Phone Clustering

Phoneme-based acoustic modeling of large vocabulary automatic speech rec...

Tina Raissi, et al.

research

∙ 04/02/2020

Full-Sum Decoding for Hybrid HMM based Speech Recognition using LSTM Language Model

In hybrid HMM based speech recognition, LSTM language models have been w...

Wei Zhou, et al.

research

∙ 04/02/2020

The RWTH ASR System for TED-LIUM Release 2: Improving Hybrid HMM with SpecAugment

We present a complete training pipeline to build a state-of-the-art hybr...

Wei Zhou, et al.

research

∙ 12/19/2019

Generating Synthetic Audio Data for Attention-Based Speech Recognition Systems

Recent advances in text-to-speech (TTS) led to the development of flexib...

Nick Rossenbach, et al.

research

∙ 11/20/2019

On using 2D sequence-to-sequence models for speech recognition

Attention-based sequence-to-sequence models have shown promising results...

Parnia Bahar, et al.

research

∙ 11/20/2019

On Using SpecAugment for End-to-End Speech Translation

This work investigates a simple data augmentation technique, SpecAugment...

Parnia Bahar, et al.

research

∙ 07/01/2019

LSTM Language Models for LVCSR in First-Pass Decoding and Lattice-Rescoring

LSTM based language models are an important part of modern LVCSR systems...

Eugen Beck, et al.

research

∙ 07/01/2019

Comparison of Lattice-Free and Lattice-Based Sequence Discriminative Training Criteria for LVCSR

Sequence discriminative training criteria have long been a standard tool...

Wilfried Michel, et al.

research

∙ 06/14/2019

Cumulative Adaptation for BLSTM Acoustic Models

This paper addresses the robust speech recognition problem as an adaptat...

Markus Kitza, et al.

research

∙ 05/10/2019

Language Modeling with Deep Transformers

We explore multi-layer autoregressive Transformer models in language mod...

Kazuki Irie, et al.

research

∙ 05/09/2019

Analysis of Deep Clustering as Preprocessing for Automatic Speech Recognition of Sparsely Overlapping Speech

Significant performance degradation of automatic speech recognition (ASR...

Tobias Menne, et al.
