DeepAI AI Chat
Log In Sign Up

Attention based end to end Speech Recognition for Voice Search in Hindi and English

11/15/2021
by   Raviraj Joshi, et al.
Flipkart
0

We describe here our work with automatic speech recognition (ASR) in the context of voice search functionality on the Flipkart e-Commerce platform. Starting with the deep learning architecture of Listen-Attend-Spell (LAS), we build upon and expand the model design and attention mechanisms to incorporate innovative approaches including multi-objective training, multi-pass training, and external rescoring using language models and phoneme based losses. We report a relative WER improvement of 15.7 models using these modifications. Overall, we report an improvement of 36.9 over the phoneme-CTC system. The paper also provides an overview of different components that can be tuned in a LAS-based system.

READ FULL TEXT

page 1

page 2

page 3

page 4

07/22/2017

Attention-Based End-to-End Speech Recognition on Voice Search

Recently, there has been an increasing interest in end-to-end speech rec...
06/26/2022

On Comparison of Encoders for Attention based End to End Speech Recognition in Standalone and Rescoring Mode

The streaming automatic speech recognition (ASR) models are more popular...
09/14/2020

EasyASR: A Distributed Machine Learning Platform for End-to-end Automatic Speech Recognition

We present EasyASR, a distributed machine learning platform for training...
12/21/2021

Voice Quality and Pitch Features in Transformer-Based Speech Recognition

Jitter and shimmer measurements have shown to be carriers of voice quali...
12/11/2019

Leveraging End-to-End Speech Recognition with Neural Architecture Search

Deep neural networks (DNNs) have been demonstrated to outperform many tr...
07/27/2018

A Comparison of Techniques for Language Model Integration in Encoder-Decoder Speech Recognition

Attention-based recurrent neural encoder-decoder models present an elega...
03/12/2020

Hybrid Autoregressive Transducer (hat)

This paper proposes and evaluates the hybrid autoregressive transducer (...