Exploring End-to-End Techniques for Low-Resource Speech Recognition

07/02/2018


In this work we present a simple grapheme-based system for low-resource speech recognition, trained on Babel data for Turkish spontaneous speech (80 hours). We investigate the performance of different neural network architectures, including fully convolutional, recurrent, and ResNet with GRU. Different features and normalization techniques are compared as well. We also propose a modification of the CTC loss that uses segmentation during training, which leads to an improvement when decoding with a small beam size. Our best model achieved a word error rate of 45.8%, which is, to our knowledge, the best reported result for end-to-end systems using in-domain data for this task.
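For readers unfamiliar with the metric quoted in the abstract, word error rate is conventionally the word-level edit distance (substitutions, deletions, and insertions) between the recognizer output and the reference transcript, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the authors' scoring pipeline; the function name `word_error_rate` and the example sentences are hypothetical, and whitespace tokenization is an assumption.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration; tokenization is plain
    whitespace splitting, which is an assumption.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against four reference words.
    print(word_error_rate("a b c d", "a x c"))
```

Under this definition, a word error rate of 45.8% corresponds to roughly 46 word-level errors per 100 reference words.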
