A novel pyramidal-FSMN architecture with lattice-free MMI for speech recognition

10/26/2018 ∙ by Xuerui Yang, et al.

Deep Feedforward Sequential Memory Network (DFSMN) has shown superior performance on speech recognition tasks. Based on this work, we propose a novel network architecture which introduces a pyramidal memory structure to represent various kinds of context information. Additionally, res-CNN layers are added at the front to extract more sophisticated features. Together with the lattice-free maximum mutual information (LF-MMI) and cross-entropy (CE) joint training criterion, experimental results show that this approach achieves a word error rate (WER) of 3.62% on the Switchboard corpus. Furthermore, recurrent neural network language model (RNNLM) rescoring is applied and a further WER improvement of above 1% is obtained.
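The core idea of an FSMN-style memory block is to augment each frame's hidden activation with a learned, element-wise weighted sum of nearby past and future frames; in the pyramidal variant, the context order grows with layer depth so deeper layers see wider context. The sketch below illustrates that mechanism in NumPy under assumed details (tap initialization, stride, and the exact order schedule `n_back=2l, n_ahead=l` are illustrative choices, not the authors' configuration):

```python
import numpy as np

def fsmn_memory(h, n_back, n_ahead, stride=1, rng=None):
    """One FSMN-style memory block (illustrative sketch, not the paper's code).

    h: (T, D) sequence of hidden activations.
    Returns (T, D): each frame plus a learned element-wise weighted sum of
    n_back past and n_ahead future frames (edges are simply skipped).
    """
    rng = rng or np.random.default_rng(0)
    T, D = h.shape
    a = rng.standard_normal((n_back + 1, D)) * 0.1  # taps: current + past frames
    c = rng.standard_normal((n_ahead, D)) * 0.1     # taps: future frames
    m = np.zeros_like(h)
    for t in range(T):
        for i in range(n_back + 1):                 # current and past context
            if t - i * stride >= 0:
                m[t] += a[i] * h[t - i * stride]
        for j in range(1, n_ahead + 1):             # future (lookahead) context
            if t + j * stride < T:
                m[t] += c[j - 1] * h[t + j * stride]
    return h + m                                    # skip connection, as in DFSMN

def pyramidal_fsmn(x, n_layers=4):
    """Stack memory blocks whose context order widens with depth,
    giving the 'pyramidal' receptive-field structure."""
    h = x
    for l in range(1, n_layers + 1):
        h = fsmn_memory(h, n_back=2 * l, n_ahead=l)  # widening orders: an assumption
    return h
```

Because each block is purely feedforward over a fixed window, the stack can be trained and evaluated without the recurrence cost of an LSTM while still covering long temporal context at depth.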





