CNN-based MultiChannel End-to-End Speech Recognition for everyday home environments

11/07/2018
by   Nelson Yalta, et al.
0

Casual conversations involving multiple speakers and noises from surrounding devices are part of everyday environments and pose challenges for automatic speech recognition systems. These challenges in speech recognition are target for the CHiME-5 challenge. In the present study, an attempt is made to overcome these challenges by employing a convolutional neural network (CNN)-based multichannel end-to-end speech recognition system. The system comprises an attention-based encoder-decoder neural network that directly generates a text as an output from a sound input. The mulitchannel CNN encoder, which uses residual connections and batch renormalization, is trained with augmented data, including white noise injection. The experimental results show that the word error rate (WER) was reduced by 11.9

READ FULL TEXT

page 1

page 2

page 3

page 4

research
06/08/2017

Advances in Joint CTC-Attention based End-to-End Speech Recognition with a Deep CNN Encoder and RNN-LM

We present a state-of-the-art end-to-end Automatic Speech Recognition (A...
research
07/22/2017

Attention-Based End-to-End Speech Recognition on Voice Search

Recently, there has been an increasing interest in end-to-end speech rec...
research
10/29/2018

Improved hybrid CTC-Attention model for speech recognition

Recently, end-to-end speech recognition with a hybrid model consisting o...
research
03/13/2018

LCANet: End-to-End Lipreading with Cascaded Attention-CTC

Machine lipreading is a special type of automatic speech recognition (AS...
research
03/28/2020

Serialized Output Training for End-to-End Overlapped Speech Recognition

This paper proposes serialized output training (SOT), a novel framework ...
research
09/22/2020

End-to-End Learning of Speech 2D Feature-Trajectory for Prosthetic Hands

Speech is one of the most common forms of communication in humans. Speec...
research
05/07/2020

ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context

Convolutional neural networks (CNN) have shown promising results for end...

Please sign up or login with your details

Forgot password? Click here to reset