Linguistic Search Optimization for Deep Learning Based LVCSR

08/02/2018
by   Zhehuai Chen, et al.

Recent advances in deep learning based large vocabulary continuous speech recognition (LVCSR) are driving growing demand for large-scale speech transcription. The inference process of a speech recognizer finds the sequence of labels whose corresponding acoustic and language models best match the input features [1]. The main computation consists of two stages: acoustic model (AM) inference and linguistic search over a weighted finite-state transducer (WFST). The large computational overhead of both stages hampers the wide application of LVCSR. Benefiting from stronger classifiers, deep learning, and more powerful computing devices, we propose general ideas and some initial trials to solve these fundamental problems.
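
The inference objective described above can be written in the standard textbook form of the speech decoding criterion. This is a sketch under assumed notation, not necessarily the formulation used in the full text: X (the input feature sequence), W (a candidate label sequence), and λ (a language model weight) are symbols introduced here for illustration.

```latex
% Assumed notation: X = input features, W = candidate label sequence,
% \lambda = language model scale (a tunable weight, not from the abstract).
\begin{align}
  \hat{W} &= \arg\max_{W} P(W \mid X)
           = \arg\max_{W} \, p(X \mid W)\, P(W) \\
  % Log-domain form commonly used in practice, with a scaled LM score:
  \hat{W} &\approx \arg\max_{W} \, \bigl[ \log p(X \mid W) + \lambda \log P(W) \bigr]
\end{align}
```

In this reading, AM inference supplies the acoustic term p(X | W), and the linguistic search combines it with the language model term P(W) to find the best-scoring sequence.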

research
08/02/2018

Sequence Discriminative Training for Deep Learning based Acoustic Keyword Spotting

Speech recognition is a sequence prediction problem. Besides employing v...
research
05/06/2022

A Highly Adaptive Acoustic Model for Accurate Multi-Dialect Speech Recognition

Despite the success of deep learning in speech recognition, multi-dialec...
research
04/25/2018

Recent Progresses in Deep Learning based Acoustic Models (Updated)

In this paper, we summarize recent progresses made in deep learning base...
research
08/15/2017

Comparison of Decoding Strategies for CTC Acoustic Models

Connectionist Temporal Classification has recently attracted a lot of in...
research
06/23/2022

Restoring speech intelligibility for hearing aid users with deep learning

Almost half a billion people world-wide suffer from disabling hearing lo...
research
04/07/2022

Linguistic-Acoustic Similarity Based Accent Shift for Accent Recognition

General accent recognition (AR) models tend to directly extract low-leve...
