Faster, Simpler and More Accurate Hybrid ASR Systems Using Wordpieces

05/19/2020
by Frank Zhang, et al.

In this work, we first show that on the widely used LibriSpeech benchmark, our transformer-based context-dependent connectionist temporal classification (CTC) system produces state-of-the-art results. We then show that by using wordpieces as modeling units combined with CTC training, we can greatly simplify the engineering pipeline compared to conventional frame-based cross-entropy training by excluding all the GMM bootstrapping, decision tree building and forced alignment steps, while still achieving a very competitive word error rate. Additionally, using wordpieces as modeling units can significantly improve runtime efficiency, since we can use a larger stride without losing accuracy. We further confirm these findings on two internal VideoASR datasets: German, which, like English, is a fusional language, and Turkish, which is an agglutinative language.
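
The link between stride and runtime can be illustrated with a small sketch. The 10 ms frame shift, the stride values, and the wordpiece labels below are illustrative assumptions, not figures from the paper. The sketch counts how many output frames the acoustic model must score for a given stride, and shows the standard greedy CTC collapse (merge repeats, then drop blanks) that maps frame-level labels to a wordpiece sequence. Because a wordpiece sequence is shorter than the corresponding phoneme or character sequence, fewer output frames are needed to emit it.

```python
import math


def output_frames(duration_s: float, frame_shift_ms: float = 10.0, stride: int = 1) -> int:
    """Number of acoustic-model output frames after subsampling by `stride`.

    The 10 ms default frame shift is an assumption for illustration.
    """
    input_frames = int(round(duration_s * 1000.0 / frame_shift_ms))
    return math.ceil(input_frames / stride)


def ctc_collapse(frame_labels, blank="<b>"):
    """Greedy CTC collapse: merge consecutive repeats, then drop blanks."""
    collapsed = []
    prev = None
    for label in frame_labels:
        if label != prev and label != blank:
            collapsed.append(label)
        prev = label
    return collapsed


if __name__ == "__main__":
    # A hypothetical 10-second utterance at several strides.
    for stride in (1, 4, 8):
        print(stride, output_frames(10.0, stride=stride))

    # Hypothetical frame-level wordpiece labels for one short word.
    print(ctc_collapse(["<b>", "_he", "_he", "<b>", "llo", "<b>"]))
```

Under these assumptions, moving from stride 1 to stride 8 cuts the number of frames to score by a factor of eight, which is where the runtime saving would come from.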


