Fully Convolutional Speech Recognition

12/17/2018
by Neil Zeghidour, et al.

Current state-of-the-art speech recognition systems build on recurrent neural networks for acoustic and/or language modeling, and rely on feature extraction pipelines to compute mel-filterbanks or cepstral coefficients. In this paper we present an alternative approach based solely on convolutional neural networks, leveraging recent advances in acoustic models from the raw waveform and language modeling. This fully convolutional approach is trained end-to-end to predict characters from the raw waveform, removing the feature extraction step altogether. An external convolutional language model is used to decode words. On Wall Street Journal, our model matches the current state of the art. On Librispeech, we report state-of-the-art performance among end-to-end models, including Deep Speech 2, which was trained with 12 times more acoustic data and significantly more linguistic data.

research
03/01/2016

Segmental Recurrent Neural Networks for End-to-end Speech Recognition

We study the segmental recurrent neural network for end-to-end acoustic ...
research
04/05/2019

Jasper: An End-to-End Convolutional Neural Acoustic Model

In this paper, we report state-of-the-art results on LibriSpeech among e...
research
06/19/2018

End-to-End Speech Recognition From the Raw Waveform

State-of-the-art speech recognition systems rely on fixed, hand-crafted ...
research
08/08/2023

Comparative Analysis of the wav2vec 2.0 Feature Extractor

Automatic speech recognition (ASR) systems typically use handcrafted fea...
research
06/29/2022

DDKtor: Automatic Diadochokinetic Speech Analysis

Diadochokinetic speech tasks (DDK), in which participants repeatedly pro...
research
08/10/2018

Densely Connected Convolutional Networks for Speech Recognition

This paper presents our latest investigation on Densely Connected Convol...
research
03/09/2016

Recursive Recurrent Nets with Attention Modeling for OCR in the Wild

We present recursive recurrent neural networks with attention modeling (...
