Very Deep Multilingual Convolutional Neural Networks for LVCSR

09/29/2015
by   Tom Sercu, et al.
0

Convolutional neural networks (CNNs) are a standard component of many current state-of-the-art Large Vocabulary Continuous Speech Recognition (LVCSR) systems. However, CNNs in LVCSR have not kept pace with recent advances in other domains where deeper neural networks provide superior performance. In this paper we propose a number of architectural advances in CNNs for LVCSR. First, we introduce a very deep convolutional network architecture with up to 14 weight layers. There are multiple convolutional layers before each pooling layer, with small 3x3 kernels, inspired by the VGG Imagenet 2014 architecture. Then, we introduce multilingual CNNs with multiple untied layers. Finally, we introduce multi-scale input features aimed at exploiting more context at negligible computational cost. We evaluate the improvements first on a Babel task for low resource speech recognition, obtaining an absolute 5.77 improvement over the baseline PLP DNN by training our CNN on the combined data of six different languages. We then evaluate the very deep CNNs on the Hub5'00 benchmark (using the 262 hours of SWB-1 training data) achieving a word error rate of 11.8 relative) over the best published CNN result so far.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
09/05/2013

Improvements to deep convolutional neural networks for LVCSR

Deep Convolutional Neural Networks (CNNs) are more powerful than Deep Ne...
research
10/02/2016

Very Deep Convolutional Neural Networks for Robust Speech Recognition

This paper describes the extension and optimization of our previous work...
research
04/06/2016

Advances in Very Deep Convolutional Neural Networks for LVCSR

Very deep CNNs with small 3x3 kernels have recently been shown to achiev...
research
12/14/2016

Beam Search for Learning a Deep Convolutional Neural Network of 3D Shapes

This paper addresses 3D shape recognition. Recent work typically represe...
research
04/26/2017

Deep Convolutional Neural Network to Detect J-UNIWARD

This paper presents an empirical study on applying convolutional neural ...
research
10/15/2019

Analyzing Large Receptive Field Convolutional Networks for Distant Speech Recognition

Despite significant efforts over the last few years to build a robust au...
research
03/19/2017

Multilevel Context Representation for Improving Object Recognition

In this work, we propose the combined usage of low- and high-level block...

Please sign up or login with your details

Forgot password? Click here to reset