Acoustic Modeling Using a Shallow CNN-HTSVM Architecture

06/27/2017
by Christopher Dane Shulby, et al.

High-accuracy speech recognition is especially challenging when large datasets are not available. This gap can be bridged by combining careful, knowledge-driven parsing with the biologically inspired CNN and the learning guarantees of Vapnik-Chervonenkis (VC) theory. This work presents a Shallow-CNN-HTSVM (Hierarchical Tree Support Vector Machine classifier) architecture, which combines a predefined, knowledge-based set of rules with statistical machine learning techniques. We show that gross errors present even in state-of-the-art systems can be avoided and that an accurate acoustic model can be built in a hierarchical fashion. The CNN-HTSVM acoustic model outperforms traditional GMM-HMM models, and the HTSVM structure outperforms an MLP multi-class classifier. More importantly, we isolate the performance of the acoustic model and report results at both the frame and phoneme level to assess the true robustness of the model. We show that accurate and robust recognition can be obtained even with a small amount of data.
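The distinction between frame-level and phoneme-level results can be illustrated with a small sketch. This is not the authors' evaluation code: the majority-vote rule for scoring a phoneme segment, the function names, and the toy labels are all assumptions made for illustration, and the paper may score phonemes differently.

```python
from collections import Counter


def frame_accuracy(pred, ref):
    """Fraction of frames whose predicted label matches the reference."""
    if len(pred) != len(ref):
        raise ValueError("pred and ref must have the same length")
    return sum(p == r for p, r in zip(pred, ref)) / len(ref)


def phoneme_accuracy(pred, segments):
    """Fraction of reference segments recovered by majority vote.

    segments is a list of (start, end, label) with end exclusive.
    Assumption: a segment counts as correct when the most frequent
    predicted frame label inside it equals the reference label.
    Segments must be non-empty.
    """
    correct = 0
    for start, end, label in segments:
        votes = Counter(pred[start:end])
        if votes.most_common(1)[0][0] == label:
            correct += 1
    return correct / len(segments)


# Hypothetical toy data: three phonemes spanning ten frames.
ref_segments = [(0, 4, "a"), (4, 7, "s"), (7, 10, "i")]
ref_frames = [label for s, e, label in ref_segments for _ in range(e - s)]
pred_frames = ["a", "a", "e", "a", "s", "s", "z", "e", "e", "i"]

print(frame_accuracy(pred_frames, ref_frames))
print(phoneme_accuracy(pred_frames, ref_segments))
```

In this toy case, isolated frame errors inside the first two segments are absorbed by the vote, while the third segment is lost because most of its frames are wrong. The two levels can therefore rank systems differently, which is one reason to report both.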

